DNA encoding methymycin and pikromycin

ABSTRACT

A biosynthetic gene cluster for methymycin and pikromycin as well as a biosynthetic gene cluster for desosamine is provided.

STATEMENT OF GOVERNMENT RIGHTS

[0001] This invention was made with a grant from the Government of the United States of America (grants GM48562, GM35906 and GM54346 from the National Institutes of Health and a grant from the Office of Naval Research). The Government may have certain rights in the invention.

BACKGROUND OF THE INVENTION

[0002] Polyhydroxyalkanoates (PHAs) are one class of biodegradable polymers. The first identified member of the PHA thermoplastics was polyhydroxybutyrate (PHB), the polymeric ester of D(−)-3-hydroxybutyrate. The biosynthetic pathway of PHB in the gram-negative bacterium Alcaligenes eutrophus is depicted in FIG. 1. PHAs related to PHB differ in the structure of the pendant arm, R (FIG. 2). For example, R═CH₃ in PHB, while R═CH₂CH₃ in polyhydroxyvalerate, and R═(CH₂)₄CH₃ in polyhydroxyoctanoate.
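The relationship between the pendant arm R and the polymer repeat unit can be illustrated with a short calculation. The sketch below is illustrative only: it assumes the usual poly(3-hydroxyalkanoate) repeat unit –O–CH(R)–CH₂–C(═O)– with a saturated alkyl R, and the function names and rounded atomic masses are choices made for this example rather than anything taken from the specification.

```python
# Illustrative sketch: repeat-unit formula and mass of a poly(3-hydroxyalkanoate)
# as a function of the number of carbons in a saturated alkyl pendant arm R.
# Assumes the repeat unit -O-CH(R)-CH2-C(=O)-; names here are hypothetical.

ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}  # g/mol, rounded


def repeat_unit(r_carbons: int) -> dict:
    """Atom counts of the repeat unit for R = C_n H_(2n+1)."""
    if r_carbons < 1:
        raise ValueError("R must contain at least one carbon")
    # Backbone contributes C3 H3 O2; the alkyl arm adds C_n H_(2n+1).
    return {"C": 3 + r_carbons, "H": 4 + 2 * r_carbons, "O": 2}


def formula(counts: dict) -> str:
    """Render atom counts as a molecular formula string, e.g. C4H6O2."""
    return "".join(f"{el}{counts[el]}" for el in ("C", "H", "O"))


def unit_mass(counts: dict) -> float:
    """Molar mass of the repeat unit in g/mol."""
    return sum(ATOMIC_MASS[el] * n for el, n in counts.items())


if __name__ == "__main__":
    # R carbons: 1 for PHB, 2 for polyhydroxyvalerate, 5 for polyhydroxyoctanoate
    for name, n in (("PHB", 1), ("PHV", 2), ("PHO", 5)):
        unit = repeat_unit(n)
        print(name, formula(unit), round(unit_mass(unit), 2))
```

Only the carbon count of R changes between the three polymers named above; branched or functionalized arms would need a different atom-count rule.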

[0003] The genes responsible for PHB synthesis in A. eutrophus have been cloned and sequenced (Peoples et al., J. Biol. Chem., 264, 15293 (1989); Peoples et al., J. Biol. Chem., 264, 15298 (1989)). Three enzymes, β-ketothiolase (phbA), acetoacetyl-CoA reductase (phbB), and PHB synthase (phbC), are involved in the conversion of acetyl-CoA to PHB. The PHB synthase gene encodes a protein of M_(r)=63,900 which is active when introduced into E. coli (Peoples et al., J. Biol. Chem., 264, 15298 (1989)).

[0004] Although PHB represents the archetypical form of a biodegradable thermoplastic, its physical properties preclude significant use of the homopolymer form. Pure PHB is highly crystalline and, thus, very brittle. However, unique physical properties resulting from the structural characteristics of the R groups in a PHA copolymer may result in a polymer with more desirable characteristics. These characteristics include altered crystallinity, UV weathering resistance, glass to rubber transition temperature (T_(g)), melting temperature of the crystalline phase, rigidity and durability (Holmes et al., EPO 00052 459; Anderson et al., Microbiol. Rev., 54, 450 (1990)). Thus, these polyesters behave as thermoplastics, with melting temperatures of 50-180° C., which can be processed by conventional extrusion and molding equipment.

[0005] Traditional strategies for producing random PHA copolymers involve feeding short- and long-chain fatty acid monomers to bacterial cultures. However, this technology is limited by the monomer units which can be incorporated into a polymer by the endogenous PHA synthase and the expense of manufacturing PHAs by existing fermentation methods (Haywood et al., FEMS Microbiol. Lett., 57, 1 (1989); Poi et al., Int. J. Biol. Macromol., 12, 106 (1990); Steinbuchel et al., In: Novel Biomaterials from Biological Sources. D. Byron (ed.), MacMillan, N.Y. (1991); Valentin et al., Appl. Microbiol. Biotechnol., 36, 507 (1992)).

[0006] The production of diverse hydroxyacyl-CoA monomers for homo- and co-polymeric PHAs also occurs in some bacteria through the reduction and condensation pathway of fatty acids. This pathway employs a fatty acid synthase (FAS) which condenses malonate and acetate. The resulting β-keto group undergoes three processing steps, β-keto reduction, dehydration, and enoyl reduction, to yield a fully saturated butyryl unit. However, this pathway provides only a limited array of PHA monomers which vary in alkyl chain length but not in the degree of alkyl group branching, saturation, or functionalization along the acyl chain.

[0007] The biosynthesis of polyketides, such as erythromycin, is mechanistically related to formation of long-chain fatty acids. However, polyketides, in contrast to fatty acids, retain ketone, hydroxyl, or olefinic functions and contain methyl or ethyl side groups interspersed along an acyl chain comparable in length to that of common fatty acids. This asymmetry in structure implies that the polyketide synthase (PKS), the enzyme system responsible for formation of these molecules, although mechanistically related to a FAS, results in an end product that is structurally very different from a long-chain fatty acid.

[0008] Because PHAs are biodegradable polymers that have the versatility to replace petrochemical-based thermoplastics, it is desirable that new, more economical methods be provided for the production of defined PHAs. Thus, what is needed are methods to produce recombinant PHA monomer synthases for the generation of PHA polymers.

[0009] Moreover, there is a continuing need for the identification and isolation of novel polyketide synthase genes, e.g., a polyketide synthase gene which encodes polypeptides that synthesize an antibiotic such as a macrolide.

SUMMARY OF THE INVENTION

[0010] The invention provides an isolated and purified nucleic acid segment comprising a nucleic acid sequence comprising a sugar (desosamine) biosynthetic gene cluster, a biologically active variant or fragment thereof, wherein the nucleic acid sequence is not derived from the eryC gene cluster of Saccharopolyspora erythraea. As described hereinbelow, the desosamine biosynthetic gene cluster from Streptomyces venezuelae was isolated, cloned and sequenced. The isolated nucleic acid segment comprising the gene cluster preferably includes a nucleic acid sequence comprising SEQ ID NO:3, or a fragment or variant thereof. The cluster was found to encode nine polypeptides including DesI (e.g., SEQ ID NO:8 encoded by SEQ ID NO:7), DesII (e.g., SEQ ID NO:10 encoded by SEQ ID NO:9), DesIII (e.g., SEQ ID NO:12 encoded by SEQ ID NO:11), DesIV (e.g., SEQ ID NO:14 encoded by SEQ ID NO:13), DesV (e.g., SEQ ID NO:16 encoded by SEQ ID NO:15), DesVI (e.g., SEQ ID NO:18 encoded by SEQ ID NO:17), DesVII (e.g., SEQ ID NO:20 encoded by SEQ ID NO:19), DesVIII (e.g., SEQ ID NO:22 encoded by SEQ ID NO:21), and DesR (e.g., SEQ ID NO:24 encoded by SEQ ID NO:23) (see FIG. 24). It is also preferred that the nucleic acid segment of the invention encoding DesR is not derived from the eryB gene cluster of Saccharopolyspora erythraea or the oleD gene from Streptomyces antibioticus. Preferably, the nucleic acid segment comprising the desosamine biosynthetic gene cluster hybridizes under moderate, or more preferably stringent, hybridization conditions to SEQ ID NO:3, or a fragment thereof. Moderate and stringent hybridization conditions are well known to the art; see, for example, sections 9.47-9.51 of Sambrook et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1989)). For example, stringent conditions are those that (1) employ low ionic strength and high temperature for washing, for example, 0.015 M NaCl/0.0015 M sodium citrate (SSC); 0.1% sodium lauryl sulfate (SDS) at 50° C., or (2) employ a denaturing agent such as formamide during hybridization, e.g., 50% formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM NaCl, 75 mM sodium citrate at 42° C. Another example is use of 50% formamide, 5×SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5×Denhardt's solution, sonicated salmon sperm DNA (50 μg/ml), 0.1% sodium dodecylsulfate (SDS), and 10% dextran sulfate at 42° C., with washes at 42° C. in 0.2×SSC and 0.1% SDS.
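The salt concentrations quoted for the hybridization and wash buffers are all multiples of a single SSC stock. As an illustrative sketch, the arithmetic can be written out as follows; the 1× composition (150 mM NaCl, 15 mM sodium citrate) is inferred here from the stated 5×SSC values, and the function names are hypothetical.

```python
# Illustrative sketch of SSC dilution arithmetic. The 1x values below are
# derived from the 5x SSC composition quoted in the text
# (0.75 M NaCl, 0.075 M sodium citrate); function names are hypothetical.

NACL_MM_1X = 150.0     # mM NaCl in 1x SSC
CITRATE_MM_1X = 15.0   # mM sodium citrate in 1x SSC


def ssc_composition(fold: float) -> tuple:
    """Return (NaCl mM, sodium citrate mM) for a given SSC strength."""
    if fold <= 0:
        raise ValueError("SSC strength must be positive")
    return (NACL_MM_1X * fold, CITRATE_MM_1X * fold)


def ssc_fold_from_nacl(nacl_mm: float) -> float:
    """Return the SSC strength implied by a NaCl concentration in mM."""
    return nacl_mm / NACL_MM_1X


if __name__ == "__main__":
    # 0.015 M NaCl / 0.0015 M citrate wash, expressed as an SSC strength
    print(ssc_fold_from_nacl(15.0))
    # 5x SSC hybridization buffer and 0.2x SSC wash, in mM
    print(ssc_composition(5))
    print(ssc_composition(0.2))
```

On this reading, the low ionic strength wash in condition (1) corresponds to 0.1×SSC, and the 750 mM NaCl/75 mM sodium citrate buffer in condition (2) corresponds to 5×SSC.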

[0011] The invention also provides a variant polypeptide having at least about 80%, more preferably at least about 90%, and even more preferably at least about 95%, but less than 100%, contiguous amino acid sequence identity to the polypeptide having an amino acid sequence comprising SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, or a fragment thereof. A preferred variant polypeptide, or a subunit or fragment of a polypeptide, of the invention includes a variant or subunit polypeptide having at least about 1%, more preferably at least about 10%, and even more preferably at least about 50%, the activity of the polypeptide having the amino acid sequence comprising SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, or SEQ ID NO:24. Thus, for example, the glycosyltransferase activity of a polypeptide of SEQ ID NO:20 can be compared to a variant of SEQ ID NO:20 having at least one amino acid substitution, insertion, or deletion relative to SEQ ID NO:20.

[0012] A variant nucleic acid sequence of the invention has at least about 80%, more preferably at least about 90%, and even more preferably at least about 95%, but less than 100%, contiguous nucleic acid sequence identity to a nucleic acid sequence comprising SEQ ID NO:3, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, or a fragment thereof.
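The identity window that defines a variant (at least about 80%, but less than 100%) can be expressed as a simple check. The sketch below is a minimal illustration under stated assumptions: the two sequences are taken to be already aligned to equal length with "-" marking gaps, identity is counted per aligned column, and the word "about" is treated as a hard cutoff. The specification does not prescribe an alignment method, and the function names are hypothetical.

```python
# Illustrative sketch: percent identity between two pre-aligned sequences and
# a check against the "at least X%, but less than 100%" variant window.
# Assumes equal-length aligned inputs with '-' for gaps; names are hypothetical.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns (excluding gap-gap columns) that match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1  # a gap opposite a residue never matches
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def is_variant(identity_pct: float, threshold_pct: float = 80.0) -> bool:
    """True if identity falls in [threshold, 100), i.e. similar but not identical."""
    return threshold_pct <= identity_pct < 100.0


if __name__ == "__main__":
    pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(pct, is_variant(pct), is_variant(pct, threshold_pct=95.0))
```

The same check applies to aligned amino acid sequences; raising `threshold_pct` to 90 or 95 gives the more preferred ranges.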

[0013] Also provided is an expression cassette comprising a nucleic acid sequence comprising a desosamine biosynthetic gene cluster, a biologically active variant or fragment thereof operably linked to a promoter functional in a host cell, as well as host cells comprising an expression cassette of the invention. Thus, the expression cassettes of the invention are useful to express individual genes within the cluster, e.g., the desR gene which encodes a glycosidase or the desVII gene which encodes a glycosyltransferase having relaxed substrate specificity for polyketides and deoxysugars, i.e., the glycosyltransferase processes sugar substrates other than TDP-desosamine. Thus, the desVII gene can be employed in combinatorial biology approaches to synthesize a library of macrolide compounds having various polyketide and deoxysugar structures. Moreover, the expression of a glycosylase in a host cell which synthesizes a macrolide antibiotic may be useful in a method to reduce toxicity of, e.g., inactivate, the antibiotic. For example, a host cell which produces the antibiotic is transformed with an expression cassette encoding the glycosyltransferase. The recombinant glycosyltransferase is expressed in an amount that reversibly inactivates the antibiotic. To activate the antibiotic, the antibiotic, preferably the isolated antibiotic which is recovered from the host cell, is contacted with an appropriate native or recombinant glycosidase.

[0014] Preferably, the nucleic acid segment encoding desosamine in the expression cassette of the invention is not derived from the eryC gene cluster of Saccharopolyspora erythraea. Preferred host cells are prokaryotic cells, although eukaryotic host cells are also envisioned. These host cells are useful to express desosamine, analogs or derivatives thereof as well as individual polypeptides which can then be isolated from the host cell. Also provided is an expression cassette or host cell comprising antisense sequences from at least a portion of the desosamine biosynthetic gene cluster.

[0015] Another embodiment of the invention is a recombinant host cell, e.g., a bacterial cell, in which at least a portion of a nucleic acid sequence encoding desosamine in the host chromosome is disrupted, e.g., deleted or interrupted (e.g., by an insertion) with heterologous sequences, or substituted with a variant nucleic acid sequence of the invention, so as to alter, preferably so as to result in a decrease or lack of, desosamine synthesis and/or so as to result in the synthesis of an analog or derivative of desosamine. Preferably, the nucleic acid sequence which is disrupted is not derived from the eryC gene cluster of Saccharopolyspora erythraea. Thus, the recombinant host cell of the invention has at least one gene, i.e., desI, desII, desIII, desIV, desV, desVI, desVII, desVIII or desR, which is disrupted. One embodiment of the invention includes a recombinant host cell in which the desVI gene, which encodes an N-methyltransferase, is disrupted, for example, by replacement with an antibiotic resistance gene. Preferably, such a host cell produces an aglycone having an N-acetylated aminodeoxy sugar, 10-deoxymethynolide, a compound of formula (7), a compound of formula (8), or a combination thereof. Thus, the deletion or disruption of the desVI gene may be useful in a method for preparing novel sugars.

[0016] Another preferred embodiment of the invention is a recombinant bacterial host cell in which the desR gene, which encodes a glycosidase such as β-glucosidase, is disrupted. Preferably, the host cell synthesizes C-2′ β-glucosylated macrolide antibiotics, for example, a compound of formula (13), a compound of formula (14), or a combination thereof. Therefore, the invention further provides a compound of formula (8), (9), (13) or (14). It will be appreciated by those skilled in the art that each atom of the compounds of the invention having a chiral center may exist in and be isolated in optically active and racemic forms. Some compounds may exhibit polymorphism. It is to be understood that the present invention encompasses any racemic, optically active, polymorphic or stereoisomeric form, or mixtures thereof, of a compound of the invention, which possess the useful properties described herein, it being well known in the art how to prepare optically active forms (for example, by resolution of the racemic form by recrystallization techniques, by synthesis from optically active starting materials, by chiral synthesis, or by chromatographic separation using a chiral stationary phase) and how to determine activity using the standard tests described herein, or using other similar tests which are well known in the art.

[0017] Also provided is a method for directing the biosynthesis of specific glycosylation-modified polyketides by genetic manipulation of a polyketide-producing microorganism. The method comprises introducing into a polyketide-producing microorganism a DNA sequence encoding enzymes in desosamine biosynthesis, e.g., a DNA sequence comprising SEQ ID NO:3, a variant or fragment thereof, so as to yield a microorganism that produces specific glycosylation-modified polyketides. Alternatively, an anti-sense DNA sequence of the invention may be employed. Then the glycosylation-modified polyketides are isolated from the microorganism. It is preferred that the DNA sequence is modified so as to result in the inactivation of at least one enzymatic activity in sugar biosynthesis or in the attachment of the sugar to a polyketide.

[0018] Further provided is an isolated and purified nucleic acid segment comprising a nucleic acid sequence comprising a macrolide biosynthetic gene cluster (the “met/pik” or “pik” gene cluster) encoding polypeptides that synthesize methymycin, pikromycin, neomethymycin, narbomycin, or a combination thereof, or a biologically active variant or fragment thereof. It is preferred that the nucleic acid segment comprises SEQ ID NO:5, or a fragment or variant thereof, or hybridizes under moderate, or more preferably stringent, conditions to SEQ ID NO:5 or a fragment thereof. It is also preferred that the isolated and purified nucleic acid segment is from Streptomyces sp., such as Streptomyces venezuelae (e.g., ATCC 15439, ATCC 15068, MCRL 0306, SC 2366 or 3629), Streptomyces narbonensis (e.g., ATCC 19790), Streptomyces eurocidicus, Streptomyces zaomyceticus (MCRL 0405), Streptomyces flavochromogens, Streptomyces sp. AM400, and Streptomyces felleus, although isolated and purified nucleic acid from other organisms which produce methymycin, narbomycin, neomethymycin and/or pikromycin is also within the scope of the invention. The cloned genes can be introduced into an expression system and genetically manipulated so as to yield novel macrolide antibiotics, e.g., ketolides, as well as monomers for polyhydroxyalkanoate (PHA) biopolymers. Preferably, the nucleic acid sequence encodes PikR1 (e.g., SEQ ID NO:27 encoded by SEQ ID NO:26), PikR2 (e.g., SEQ ID NO:29 encoded by SEQ ID NO:28), PikAI (e.g., SEQ ID NO:31 encoded by SEQ ID NO:30), PikAII (e.g., SEQ ID NO:33 encoded by SEQ ID NO:32), PikAIII (e.g., SEQ ID NO:35 encoded by SEQ ID NO:34), PikAIV (e.g., SEQ ID NO:37 encoded by SEQ ID NO:36), PikB (which is the desosamine gene cluster described above), PikC (e.g., SEQ ID NO:39 encoded by SEQ ID NO:38), and PikD (e.g., SEQ ID NO:41 encoded by SEQ ID NO:40), a variant or a fragment thereof, or hybridizes under moderate or preferably stringent conditions to such a nucleic acid sequence.

[0019] The invention also provides a variant polypeptide having at least about 80%, more preferably at least about 90%, and even more preferably at least about 95%, but less than 100%, contiguous amino acid sequence identity to the polypeptide having an amino acid sequence comprising SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, or a fragment thereof. A preferred variant polypeptide, or a subunit or fragment of a polypeptide, of the invention includes a variant or subunit polypeptide having at least about 1%, more preferably at least about 10%, and even more preferably at least about 50%, the activity of the polypeptide having the amino acid sequence comprising SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, or SEQ ID NO:41. The activities of polypeptides of the macrolide biosynthetic pathway of the invention are described below.

[0020] A variant nucleic acid sequence of the pik biosynthetic gene cluster of the invention has at least about 80%, more preferably at least about 90%, and even more preferably at least about 95%, but less than 100%, contiguous nucleic acid sequence identity to a nucleic acid sequence comprising SEQ ID NO:5, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, or a fragment thereof.

[0021] The pikA gene encodes a polyketide synthase which synthesizes the macrolactones 10-deoxymethynolide and narbonolide, pikB encodes desosamine synthases which catalyze the formation and transfer of a deoxysugar moiety onto aglycones, the pikC gene encodes a P450 hydroxylase which catalyzes the conversion of YC-17 and narbomycin into methymycin, neomethymycin, and pikromycin, and the pikR1, pikR2 (possibly one for a 12-membered ring and the other for a 14-membered ring) and desR genes encode enzymes associated with bacterial self-protection. Thus, the isolated nucleic acid molecule of the invention encodes four active macrolide antibiotics, two of which have a 12-membered ring while the other two have a 14-membered ring. The genetic mechanism underlying the alternative termination of polyketide synthesis may be useful to prepare novel compounds, e.g., antibiotics, and PHA monomers. The invention further provides isolated and purified nucleic acid segments, e.g., in the form of an expression cassette, for each of the individual genes in the macrolide biosynthetic gene cluster. For example, the invention provides an isolated and purified pikAV gene that encodes a thioesterase II. In particular, the thioesterase may be useful to enhance the structural diversity of antibiotics and in PHA production, as the thioesterase modulates chain release and cyclization. For example, a thioesterase II gene having acyl-ACP coenzyme A transferase activity (e.g., a mutant pik TEII, bacterial, fungal or plant medium-chain-length thioesterase, an animal fatty acid thioesterase or a thioesterase from a polyketide synthase) is introduced at the end of a recombinant monomer synthase (see FIG. 36), which, in the presence of a PHA synthase, e.g., phaC1, produces a novel polyhydroxyalkanoate polymer. Alternatively, in the absence of a TEII domain, a fusion of a portion of PKS gene cluster with a PHA synthase may result in the transfer of an acyl chain from the PHA to the polymerase.

[0022] Also provided is a pikC gene that encodes a hydroxylase which is active at two positions on a 12-membered ring or at one position on a 14-membered ring. Such a gene may be particularly useful to prepare novel compounds through bioconversion or biotransformation.

[0023] The invention also provides an expression cassette comprising a nucleic acid segment comprising a macrolide biosynthetic gene cluster encoding polypeptides that synthesize methymycin, pikromycin, neomethymycin, narbomycin, or a combination thereof, or a biologically active variant or fragment thereof, operably linked to a promoter functional in a host cell. Further provided is a host cell comprising the nucleic acid segment encoding methymycin, pikromycin, neomethymycin, narbomycin, or a combination thereof, or a biologically active variant or fragment thereof. Moreover, the invention provides isolated and purified polypeptides of the invention, preferably obtained from host cells having the nucleic acid molecules of the invention. In addition, expression cassettes and host cells comprising antisense sequences of at least a portion of the macrolide biosynthetic gene cluster of the invention are envisioned.

[0024] Yet another embodiment of the invention is a recombinant host cell, e.g., a bacterial cell, in which a portion of the macrolide biosynthetic gene cluster of the invention is disrupted or replaced with a heterologous sequence or a variant nucleic acid segment of the invention, so as to alter, preferably so as to result in a decrease or lack of methymycin, pikromycin, neomethymycin, narbomycin, or a combination thereof, and/or so as to result in the synthesis of novel macrolides. Therefore, the invention provides a recombinant host cell in which a pikAI gene, a pikAII gene, a pikAIII gene (12-membered rings), a pikAIV gene (14-membered rings), a pikB gene cluster, a pikAV gene, a pikC gene, a pikD gene, a pikR1 gene, a pikR2 gene, or a combination thereof, is disrupted or replaced. A preferred embodiment of the invention is a host cell wherein the pikB (e.g., the desVI and desV genes), pikAI, pikAV or pikC gene is disrupted.

[0025] Although the sixth (final) condensation cycle is not required for 10-deoxymethynolide formation, as described hereinbelow genetic disruption of Pik module 6 (encoded by pikAIV) prevented production of both the 12- as well as the 14-membered ring macrolactones. Thus, expression of alternative forms of PikAIV controls the final step in polyketide chain elongation and termination. Specifically, an N-terminal truncated form of PikAIV leads to 10-deoxymethynolide formation while full-length PikAIV results in narbonolide production. The expression of a truncated PKS module represents a novel method of polyketide chain length determination. Moreover, as the expression of such a module may produce multiple polyketides, the use of such a module may result in the more rapid identification of novel products.

[0026] The invention also provides a method for combinatorial biosynthesis. The method comprises expressing in a host cell an expression cassette comprising a DNA fragment of a biosynthetic gene cluster, e.g., a polyketide synthase gene, wherein the expression cassette is present on a plasmid, and wherein the genome of the host cell comprises a portion of the gene which is different from the portion of the gene present on the plasmid. Preferably, the DNA fragment and the portion of the gene which is on the host chromosome together comprise the entire gene. Synchronized expression of genes from the plasmid and the chromosome thus creates a combinatorial pathway that produces a product. The smaller size of the plasmid facilitates gene manipulation so that a large library of recombinant pathways can thus be generated in a short time. Preferably, the DNA fragment and the portion of the gene cluster on the host chromosome are linked to the native promoter, e.g., pik genes are linked to PpikA.

[0027] Moreover, as the nucleic acid segment comprising the macrolide biosynthetic gene cluster of the invention encodes a polyketide synthase, modules of that synthase are useful in methods to prepare recombinant polyhydroxyalkanoate monomer synthases and polymers in addition to macrolide antibiotics and derivatives thereof.

[0028] Thus, the invention provides an isolated and purified DNA molecule comprising a first DNA segment encoding a first module and a second DNA segment encoding a second module, wherein the DNA segments together encode a recombinant polyhydroxyalkanoate monomer synthase, and wherein at least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae. Preferably, no more than one DNA segment is derived from the eryA gene cluster of Saccharopolyspora erythraea. In one embodiment of the invention, the 3′ most DNA segment of the isolated DNA molecule of the invention encodes a thioesterase II. Also provided is an expression cassette comprising a nucleic acid molecule encoding the polyhydroxyalkanoate monomer synthase operably linked to a promoter functional in a host cell.

[0029] Yet another embodiment of the invention is a method of providing a polyhydroxyalkanoate monomer. The method comprises introducing into a host cell a DNA molecule comprising a DNA segment encoding a recombinant polyhydroxyalkanoate monomer synthase operably linked to a promoter functional in the host cell. The DNA molecule comprises a plurality of DNA segments, e.g., a first module and a second module, wherein at least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae. The DNA encoding the recombinant polyhydroxyalkanoate monomer synthase is then expressed in the host cell so as to generate a polyhydroxyalkanoate monomer. Optionally, a second DNA molecule may be introduced into the host cell. The second DNA molecule comprises a DNA segment encoding a polyhydroxyalkanoate synthase operably linked to a promoter functional in the host cell. The two DNA molecules are expressed in the host cell so as to generate a polyhydroxyalkanoate polymer.

[0030] Another embodiment of the invention is an isolated and purified DNA molecule comprising a first DNA segment encoding a fatty acid synthase and a second DNA segment encoding a module from the pikA gene cluster of Streptomyces venezuelae. Such a DNA molecule can be employed in a method of providing a polyhydroxyalkanoate monomer. Thus, a DNA molecule comprising a first DNA segment encoding a fatty acid synthase and a second DNA segment encoding a polyketide synthase is introduced into a host cell. The first DNA segment is 5′ to the second DNA segment and the first DNA segment is operably linked to a promoter functional in the host cell. The first DNA segment is linked to the second DNA segment so that the linked DNA segments express a fusion protein. The DNA molecule is expressed in the host cell so as to generate a polyhydroxyalkanoate monomer.

[0031] Further provided is a method of providing a polyhydroxyalkanoate monomer synthase. The method comprises introducing an expression cassette comprising a DNA molecule encoding a polyhydroxyalkanoate synthase operably linked to a promoter functional in a host cell. The DNA molecule comprises a first DNA segment encoding a first module and a second DNA segment encoding a second module wherein the DNA segments together encode a polyhydroxyalkanoate monomer synthase. At least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae. The DNA molecule is expressed in the host cell. Optionally, the DNA molecule further comprises a DNA segment encoding a polyhydroxyalkanoate synthase. Alternatively, a second, separate DNA molecule encoding a polyhydroxyalkanoate synthase is introduced into the host cell.

[0032] A further embodiment of the invention is an isolated and purified DNA molecule comprising a DNA segment which encodes a Streptomyces venezuelae polyketide synthase, e.g., a polyhydroxyalkanoate monomer synthase, a biologically active variant or subunit (fragment) thereof. Preferably, the DNA segment encodes a polypeptide having an amino acid sequence comprising SEQ ID NO:2. Preferably, the DNA segment comprises SEQ ID NO:1. The DNA molecules of the invention are double stranded or single stranded. A preferred embodiment of the invention is a DNA molecule that has at least about 70%, more preferably at least about 80%, and even more preferably at least about 90%, but less than 100%, contiguous sequence identity to the DNA segment comprising SEQ ID NO:1, e.g., a “variant” DNA molecule. A variant DNA molecule of the invention can be prepared by methods well known to the art, including oligonucleotide-mediated mutagenesis. See Adelman et al., DNA, 2, 183 (1983) and Sambrook et al., Molecular Cloning: A Laboratory Manual (1989).

[0033] The invention also provides an isolated, purified polyhydroxyalkanoate monomer synthase, e.g., a polypeptide having an amino acid sequence comprising SEQ ID NO:2, a biologically active subunit, or a biologically active variant thereof. Thus, the invention provides a variant polypeptide having at least about 80%, more preferably at least about 90%, and even more preferably at least about 95%, but less than 100%, contiguous amino acid sequence identity to the polypeptide having an amino acid sequence comprising SEQ ID NO:2. A preferred variant polypeptide, or a subunit of a polypeptide, of the invention includes a variant or subunit polypeptide having at least about 10%, more preferably at least about 50%, and even more preferably at least about 90%, the activity of the polypeptide having the amino acid sequence comprising SEQ ID NO:2. Preferably, a variant polypeptide of the invention has one or more conservative amino acid substitutions relative to the polypeptide having the amino acid sequence comprising SEQ ID NO:2. For example, conservative substitutions include aspartic/glutamic as acidic amino acids; lysine/arginine/histidine as basic amino acids; leucine/isoleucine, methionine/valine, alanine/valine as hydrophobic amino acids; serine/glycine/alanine/threonine as hydrophilic amino acids. The biological activity of a polypeptide of the invention can be measured by methods well known to the art, including but not limited to, methods described hereinbelow.
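The conservative substitution groups listed above lend themselves to a simple lookup. The following is an illustrative sketch only: it encodes exactly the groupings named in the paragraph using one-letter amino acid codes, treats a substitution as conservative when both residues fall within the same listed group, and assumes the reference and variant sequences have equal length with no insertions or deletions. The function names are hypothetical.

```python
# Illustrative sketch: classify amino acid substitutions as conservative using
# only the groupings listed in the text (one-letter codes). Function names are
# hypothetical; sequences are assumed to be equal length with no indels.

CONSERVATIVE_GROUPS = [
    frozenset("DE"),    # acidic: aspartic/glutamic
    frozenset("KRH"),   # basic: lysine/arginine/histidine
    frozenset("LI"),    # hydrophobic: leucine/isoleucine
    frozenset("MV"),    # hydrophobic: methionine/valine
    frozenset("AV"),    # hydrophobic: alanine/valine
    frozenset("SGAT"),  # hydrophilic: serine/glycine/alanine/threonine
]


def is_conservative(ref: str, var: str) -> bool:
    """True if ref -> var is a substitution within one of the listed groups."""
    ref, var = ref.upper(), var.upper()
    if ref == var:
        return False  # identical residues are not a substitution
    return any(ref in g and var in g for g in CONSERVATIVE_GROUPS)


def count_substitutions(reference: str, variant: str) -> tuple:
    """Return (conservative, non_conservative) substitution counts."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    conservative = 0
    non_conservative = 0
    for r, v in zip(reference.upper(), variant.upper()):
        if r == v:
            continue
        if is_conservative(r, v):
            conservative += 1
        else:
            non_conservative += 1
    return conservative, non_conservative


if __name__ == "__main__":
    print(is_conservative("D", "E"))            # acidic pair
    print(count_substitutions("MKDL", "MRDV"))  # K->R listed, L->V not listed
```

Because the list pairs residues narrowly (for example, leucine with isoleucine but not with valine), a substitution that other classification schemes would call conservative may be counted as non-conservative here.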

[0034] Thus, the modules encoded by the nucleic acid segments of the invention may be employed in the methods described hereinabove to prepare polyhydroxyalkanoates of varied chain length or having various side chain substitutions and/or to prepare glycosylated biopolymers.

[0035] The compounds produced by the recombinant host cells of the invention are useful as biopolymers, e.g., in packaging or biomedical applications, to engineer PHA monomer synthases, or to prepare biologically active agents, such as those useful to prepare a medicament for the treatment of a pathological condition or a symptom in a mammal, e.g., a human. The agents include pharmaceuticals such as chemotherapeutic agents, immunosuppressants, agents to treat asthma, chronic obstructive pulmonary disease as well as other diseases involving respiratory inflammation, cholesterol-lowering agents, or macrolide-based antibiotics which are active against a variety of organisms, e.g., bacteria, including multi-drug-resistant pneumococci and other respiratory pathogens, as well as viral and parasitic pathogens; or as crop protection agents (e.g., fungicides or insecticides) via expression of polyketides in plants. Methods employing these compounds, e.g., to treat a mammal, bird or fish in need of such therapy, such as a patient having a bacterial, viral or parasitic infection, cancer, respiratory disease, or in need of immunosuppression, e.g., during cell, tissue or organ transplantation, are also envisioned.

BRIEF DESCRIPTION OF THE FIGURES

[0036] FIG. 1. The PHB biosynthetic pathway in A. eutrophus.

[0037] FIG. 2. Molecular structure of common bacterial PHAs. Most of the known PHAs are polymers of 3-hydroxy acids possessing the general formula shown. For example, R═CH₃ in PHB, R═CH₂CH₃ in polyhydroxyvalerate (PHV), and R═(CH₂)₄CH₃ in polyhydroxyoctanoate (PHO).

[0038] FIG. 3. Comparison of the natural and recombinant pathways for PHB synthesis. The three enzymatic steps of PHB synthesis in bacteria involving 3-ketothiolase, acetoacetyl-CoA reductase, and PHB synthase are shown on the left. The two enzymatic steps involved in PHB synthesis in the pathway in Sf21 cells containing a rat fatty acid synthase with an inactivated dehydrase domain (ratFAS206) are shown on the right.

[0039]FIG. 4. Schematic diagram of the molecular organization of the tylpolyketide synthase (PKS) gene cluster. Open arrows correspond toindividual open reading frames (ORFs) and numbers above an ORF denote amultifunctional module or synthase unit (SU). AT=acyltransferase;ACP=acyl carrier protein; KS=β-ketoacyl synthase; KR=ketoreductase;DH=dehydrase; ER=enoyl reductase; TE=thioesterase; MM=methylmalonylCoA;M=malonyl CoA; EM=ethylmalonyl CoA. Module 7 in tyl is also known asModule F.

[0040]FIG. 5. Schematic diagram of the molecular organization of the metPKS gene cluster.

[0041]FIG. 6. Strategy for producing a recombinant PHA monomer synthaseby domain replacement.

[0042] FIG. 7. (A) 10% SDS-PAGE gel showing samples from various stages of the purification of PHA synthase; lane 1, molecular weight markers; lane 2, total protein of uninfected insect cells; lane 3, total protein of insect cells expressing a rat FAS (200 kDa; Joshi et al., Biochem. J., 296, 143 (1993)); lane 4, total protein of insect cells expressing PHA synthase; lane 5, soluble protein from sample in lane 4; lane 6, pooled hydroxylapatite (HA) fractions containing PHA synthase. (B) Western analysis of an identical gel using rabbit-α-PHA synthase antibody as probe. Bands designated with arrows are: a, intact PHB synthase with N-terminal alanine at residue 7 and serine at residue 10 (A7/S10); b, 44 kDa fragment of PHB synthase with N-terminal alanine at residue 181 and asparagine at residue 185 (A181/N185); c, PHB synthase fragment of approximately 30 kDa apparently blocked based on resistance to Edman degradation; d, 22 kDa fragment with N-terminal glycine at residue 187 (G187). Band d apparently does not react with rabbit-α-PHB synthase antibody (B, lane 6). The band of similar size in B, lane 4 was not further identified.

[0043] FIG. 8. N-terminal analysis of PHA synthase purified from insect cells. (a) The expected N-terminal 25 amino acid sequence of A. eutrophus PHA synthase. (b&c) The two N-terminal sequences determined for the A. eutrophus PHA synthase produced in insect cells. The bolded sequences are the actual N-termini determined.

[0044] FIG. 9. Spectrophotometric scans of substrate, 3-hydroxybutyrate CoA (HBCoA) and product, CoA. The wavelength at which the direct spectrophotometric assays were carried out (232 nm) is denoted by the arrow; substrate, HBCoA () and product, CoA (◯).

[0045] FIG. 10. Velocity of the hydrolysis of HBCoA as a function of substrate concentration. Assays were carried out in 40 or 200 μl assay volumes with enzyme concentration remaining constant at 0.95 mg/ml (3.8 μg/40 μl assay). Velocities were calculated from the linear portions of the assay curves subsequent to the characteristic lag period. The substrate concentration at half-optimal velocity, the apparent K_(m) value, was estimated to be 2.5 mM from these data.

[0046] FIG. 11. Double reciprocal plot of velocity versus substrate concentration. The concave upward shape of this plot is similar to results obtained by Fukui et al. (Arch. Microbiol., 110, 149 (1976)) with granular PHA synthase from Z. ramigera.

[0047] FIG. 12. Velocity of the hydrolysis of HBCoA as a function of enzyme concentration. Assays were carried out in 40 μl assay volumes with the concentration of HBCoA remaining constant at 8 μM.

[0048] FIG. 13. Specific activity of PHA synthase as a function of enzyme concentration.

[0049] FIG. 14. pH activity curve for soluble PHA synthase produced using the baculovirus system. Reactions were carried out in the presence of 200 mM P_(i). Buffers of pH<10 were prepared with potassium phosphate, while buffers of pH>10 were prepared with the appropriate proportion of Na₃PO₄.

[0050] FIG. 15. Assays of the hydrolysis of HBCoA with varying amounts of PHA synthase. Assays were carried out in 40 μl assay volumes with the concentration of HBCoA remaining constant at 8 μM. Initial A₂₃₂ values, originally between 0.62 and 0.77, were normalized to 0.70. Enzyme amounts used in these assays were, from the uppermost curve, 0.38, 0.76, 1.14, 1.52, 1.90, 2.28, 2.66, 3.02, 3.42, 7.6, and 15.2 μg, respectively.

[0051] FIG. 16. SDS/PAGE analysis of proteins synthesized at various time points during infection of Sf21 cells. Approximately 0.5 mg of total cellular protein from various samples was fractionated on a 10% polyacrylamide gel. Samples include: uninfected cells, lanes 1-4, days 0, 1, 2, 3, respectively; infection with BacPAK6::phbC alone, lanes 5-8, days 0, 1, 2, 3, respectively; infection with baculoviral clone containing ratFAS206 alone, lanes 9-12, days 0, 1, 2, 3, respectively; and ratFAS206 and BacPAK6 infected cells, lanes 13-16, days 0, 1, 2, 3, respectively. A=mobility of FAS, B=mobility of PHA synthase. Molecular weight standard lanes are marked M.

[0052] FIG. 17. Gas chromatographic evidence for PHB accumulation in Sf21 cells. Gas chromatograms from various samples are superimposed. PHB standard (Sigma) is chromatogram #7 showing a propylhydroxybutyrate elution time of 10.043 minutes (s, arrow). The gas chromatograms of extracts of the uninfected (#1); singly infected with ratFAS206 (#2, day 3); and singly infected with PHA synthase (#3, day 3) are shown at the bottom of the figure. Gas chromatograms of extracts of dual-infected cells at day 1 (#4), 2 (#5), and 3 (#6) are also shown exhibiting a peak eluting at 10.096 minutes (x, arrow). The peak of dual-infected, day 3 extract (#6) was used for mass spectrometry (MS) analysis.

[0053] FIG. 18. Gas chromatography-mass spectrometry analysis of PHB. The characteristic fragmentation of propylhydroxybutyrate at m/z of 43, 60, 87, and 131 is shown. A) standard PHB from bacteria (Sigma), and B) peak X from ratFAS206 and BacPAK6::phbC baculovirus infected, day 3 (#6, FIG. 17) Sf21 cells expressing rat FAS dehydrase inactivated protein and PHA synthase.

[0054] FIG. 19. Map of the vep (Streptomyces venezuelae polyene encoding) gene cluster.

[0055] FIG. 20. Plasmid map of pDHS502.

[0056] FIG. 21. Plasmid map of pDHS505.

[0057] FIG. 22. Cloning protocol for pDHS505.

[0058] FIG. 23. Nucleotide sequence (SEQ ID NO:1) and corresponding amino acid sequence (SEQ ID NO:22) of vep ORFI.

[0059] FIG. 24. Schematic diagram of the desosamine biosynthetic pathway and the enzymatic activity associated with each of the desosamine biosynthetic polypeptides.

[0060] FIG. 25. Schematic of the conversion of the inactive (diglycosylated) form of methymycin and pikromycin to the active form of methymycin and pikromycin.

[0061] FIG. 26. Schematic diagram of the desosamine biosynthetic pathway.

[0062] FIG. 27. Pathway for the synthesis of a compound of formula 7 and 8 in desVI mutants of Streptomyces.

[0063] FIG. 28. Structure and biosynthesis of methymycin, pikromycin, and related compounds in Streptomyces venezuelae ATCC 15439. Methymycin: R₁═OH, R₂═H; neomethymycin: R₁═H, R₂═OH; pikromycin: R₃═OH; narbomycin: R₃═H. Polyketide synthase components PikAI, PikAII, PikAIII, PikAIV, and PikAV are represented by solid bars. Each circle represents an enzymatic domain in the Pik PKS system. KS: β-ketoacyl-ACP synthase, AT: acyltransferase, ACP: acyl carrier protein, KR: β-ketoacyl-ACP reductase, DH: β-hydroxyl-thioester dehydratase, ER: enoyl reductase, KS^(Q): a KS-like domain, KR with a cross: nonfunctional KR, TE: thioesterase domain, and TEII: type II thioesterase. Des represents all eight enzymes for desosamine biosynthesis and transfer and PikC is the cytochrome P450 monooxygenase responsible for hydroxylation at R₁, R₂, and R₃ positions (Xu et al., 1998).

[0064] FIG. 29. Organization of the pik cluster in S. venezuelae. Each arrow represents an open reading frame (ORF). The direction of transcription and relative sizes of the ORFs deduced from nucleotide sequence are indicated. The cluster is composed of four genetic loci: pikA, pikB (des), pikC, and pikR. Cosmid clones are denoted as overlapping lines.

[0065] FIG. 30. Conversion of YC-17 and narbomycin by PikC P450 hydroxylase.

[0066] FIG. 31. Nucleotide sequence (SEQ ID NO:5) and inferred amino acid sequence (SEQ ID NO:6) of the pik gene cluster.

[0067] FIG. 32. Nucleotide sequence (SEQ ID NO:3) and inferred amino acid sequence (SEQ ID NO:4) of the desosamine gene cluster.

[0068] FIG. 33. S. venezuelae AX916 construct useful to prepare a polyketide having a shorter chain length compared to wild-type pikA. pik module 2 is fused to pik module 5, and modules 3 and 4 are deleted, so as to encode a three module PKS which produces two macrolides, a triketide and a tetraketide.

[0069] FIG. 34. Recombinant PKS having a wild-type thioesterase II.

[0070] FIG. 35. pAX703 construct, an expression and complementation vector. The PikTEII gene can be replaced with an EcoRI-NsiI fragment. The phaC1 gene can be replaced with a PacI-DraI fragment.

[0071] FIG. 36. Strategy for C7 polymer production. mTEII is a mutant pikTEII, an acyl-ACP CoA transferase; phaC1 is a PHA polymerase 1 from P. oleovorans which may have racemase activity. In a strain having these constructs, AX916, a PHA polymer is produced.

[0072] FIG. 37. Strategy for C5 polymer production. A PHA polymerase gene phaC1 is directly fused to pik module 2, so as to result in a fusion that transfers an acyl chain from the PKS protein directly to the polymerase by the prosthetic group on the ACP domain of the PKS.

[0073] FIG. 38. Codons for specified amino acids.

[0074] FIG. 39. Exemplary and preferred amino acid substitutions.

[0075] FIG. 40. Plasmid complementation of S. venezuelae AX912. The relevant genotype (on the chromosome and on the plasmid) is listed on the left side and the corresponding phenotype is listed on the right side. The pikA genes are indicated by open arrows with divided boxes indicating domains in the PKS. An internal alternative translation start site for PikAIV is indicated by an * above the KS₆ domain and a hexa-histidine was introduced into mutant AX912 chromosome position marked by a) to facilitate the detection of PikAIV expression. Antibiotic production was determined following complementation of mutant AX912 with the corresponding plasmids. Antibiotic production was normalized by using AX912 as 0% and full-length pikAIV complementation (pDHS707) as 100% standards.

[0076] FIG. 41. Mechanistic models for alternative termination by PikAIV. Proteins PikAIII and PikAIV are stacked one on top of the other according to their order in polyketide biosynthesis (PikAI and PikAII are not shown). A sphere represents an enzymatic domain in the PKSs with its diameter proportional to the size of the domain. Each PKS module/protein was first dimerized (each peptide chain is shown as either red or blue) and then twisted 180 degrees to form a half helix following the model for erythromycin PKS (Staunton et al., 1999). Two sets of independent active sites are thus formed along two grooves of the helix that lead to the production of two polyketides in each biosynthetic cycle. A) Wild type S. venezuelae under culture conditions for pikromycin production. B) Wild type S. venezuelae under culture conditions for methymycin production. C) S. venezuelae AX912 (pDHS704) under culture conditions for methymycin production. D) S. venezuelae AX912 (pDHS704) under culture conditions for pikromycin production. E) S. venezuelae AX912 (pDHS708) under culture conditions for pikromycin production. F) S. venezuelae AX912 (pDHS708) under culture conditions for methymycin production. Gene products expressed from the plasmid construct used for complementation are underlined.

[0077] FIG. 42. Pathway for desosamine biosynthesis.

[0078] FIG. 43. Schematic of pathway leading to methymycin/neomethymycin analogs 18 and 19.

[0079] FIG. 44. Macrolide having D-quinovose.

[0080] FIG. 45. Products produced by desI mutant.

[0081] FIG. 46. Pik sequences from Streptomyces spp. A) PikA3-pikA4 from S. venezuelae ATCC 15068 (SEQ ID NO:54). B) PikA3-pikA4 from S. narbonensis ATCC 19790 (SEQ ID NO:55). C) TEII gene from S. venezuelae ATCC 15068 (SEQ ID NO:56). D) TEII gene from S. narbonensis ATCC 19790 (SEQ ID NO:57).

DETAILED DESCRIPTION OF THE INVENTION

[0082] Definitions

[0083] As used herein, a “linker region” is an amino acid sequence present in a multifunctional protein which is less well conserved in an amino acid sequence than an amino acid sequence with catalytic activity.

[0084] As used herein, an “extender unit” catalytic or enzymatic domain is an acyl transferase in a module that catalyzes chain elongation by adding 2-4 carbon units to an acyl chain and is located carboxy-terminal to another acyl transferase. For example, an extender unit with methylmalonylCoA specificity adds acyl groups to a methylmalonylCoA molecule.

[0085] As used herein, a “polyhydroxyalkanoate” or “PHA” polymer includes, but is not limited to, linked units of related, preferably heterologous, hydroxyalkanoates such as 3-hydroxybutyrate, 3-hydroxyvalerate, 3-hydroxycaproate, 3-hydroxyheptanoate, 3-hydroxyhexanoate, 3-hydroxyoctanoate, 3-hydroxyundecanoate, and 3-hydroxydodecanoate, and their 4-hydroxy and 5-hydroxy counterparts.

[0086] As used herein, a “Type I polyketide synthase” is a single polypeptide with a single set of iteratively used active sites. This is in contrast to a Type II polyketide synthase which employs active sites on a series of polypeptides.

[0087] As used herein, a “recombinant” nucleic acid or protein molecule is a molecule where the nucleic acid molecule which encodes the protein has been modified in vitro, so that its sequence is not naturally occurring, or corresponds to naturally occurring sequences that are not positioned as they would be positioned in a genome which has not been modified.

[0088] A “recombinant” host cell of the invention has a genome that has been manipulated in vitro so as to alter, e.g., decrease or disrupt, or, alternatively, increase, the function or activity of at least one gene in the macrolide or desosamine biosynthetic gene cluster of the invention.

[0089] As used herein, a “multifunctional protein” is one where two or more enzymatic activities are present on a single polypeptide.

[0090] As used herein, a “module” is one of a series of repeated units in a multifunctional protein, such as a Type I polyketide synthase or a fatty acid synthase.

[0091] As used herein, a “premature termination product” is a product which is produced by a recombinant multifunctional protein which is different from the product produced by the non-recombinant multifunctional protein. In general, the product produced by the recombinant multifunctional protein has fewer acyl groups.

[0092] As used herein, a DNA that is “derived from” a gene cluster is a DNA that has been isolated and purified in vitro from genomic DNA, or synthetically prepared on the basis of the sequence of genomic DNA.

[0093] As used herein, the “pik” or “pik/met” gene cluster includes sequences encoding a polyketide synthase (pikA), desosamine biosynthetic enzymes (pikB, also referred to as des), a cytochrome P450 (pikC), regulatory factors (pikD) and enzymes for cellular self-resistance (pikR).

[0094] As used herein, the terms “isolated and/or purified” refer to in vitro isolation of a DNA or polypeptide molecule from its natural cellular environment, and from association with other components of the cell, such as nucleic acid or polypeptide, so that it can be sequenced, replicated and/or expressed. Moreover, the DNA may encode more than one recombinant Type I polyketide synthase and/or fatty acid synthase. For example, “an isolated DNA molecule encoding a polyhydroxyalkanoate monomer synthase” is RNA or DNA containing greater than 7, preferably 15, and more preferably 20 or more sequential nucleotide bases that encode a biologically active polypeptide, fragment, or variant thereof, that is complementary to the non-coding, or complementary to the coding strand, of a polyhydroxyalkanoate monomer synthase RNA, or hybridizes to the RNA or DNA encoding the polyhydroxyalkanoate monomer synthase and remains stably bound under stringent conditions, as defined by methods well known to the art, e.g., in Sambrook et al., supra.

[0095] An “antibiotic” as used herein is a substance produced by a microorganism which, either naturally or with limited chemical modification, will inhibit the growth of or kill another microorganism or eukaryotic cell.

[0096] An “antibiotic biosynthetic gene” is a nucleic acid, e.g., DNA, segment or sequence that encodes an enzymatic activity which is necessary for an enzymatic reaction in the process of converting primary metabolites into antibiotics.

[0097] An “antibiotic biosynthetic pathway” includes the entire set of antibiotic biosynthetic genes necessary for the process of converting primary metabolites into antibiotics. These genes can be isolated by methods well known to the art, e.g., see U.S. Pat. No. 4,935,340.

[0098] Antibiotic-producing organisms include any organism, including, but not limited to, Actinoplanes, Actinomadura, Bacillus, Cephalosporium, Micromonospora, Penicillium, Nocardia, and Streptomyces, which either produces an antibiotic or contains genes which, if expressed, would produce an antibiotic.

[0099] An antibiotic resistance-conferring gene is a DNA segment that encodes an enzymatic or other activity which confers resistance to an antibiotic.

[0100] The term “polyketide” as used herein refers to a large and diverse class of natural products, including but not limited to antibiotic, antifungal, anticancer, and anti-helminthic compounds. Antibiotics include, but are not limited to anthracyclines and macrolides of different types (polyenes and avermectins as well as classical macrolides such as erythromycins). Macrolides are produced by, for example, S. erythreus, S. antibioticus, S. venezuelae, S. fradiae and S. narbonensis.

[0101] The term “glycosylated polyketide” refers to any polyketide that contains one or more sugar residues.

[0102] The term “glycosylation-modified polyketide” refers to a polyketide having a changed glycosylation pattern or configuration relative to that particular polyketide's unmodified or native state.

[0103] The term “polyketide-producing microorganism” as used herein includes any microorganism that can produce a polyketide naturally or after being suitably engineered (i.e., genetically). Examples of actinomycetes that naturally produce polyketides include but are not limited to Micromonospora rosaria, Micromonospora megalomicea, Saccharopolyspora erythraea, Streptomyces antibioticus, Streptomyces albireticuli, Streptomyces ambofaciens, Streptomyces avermitilis, Streptomyces fradiae, Streptomyces griseus, Streptomyces hygroscopicus, Streptomyces tsukubaensis, Streptomyces mycarofaciens, Streptomyces platensis, Streptomyces violaceoniger, Streptomyces thermotolerans, Streptomyces rimosus, Streptomyces peucetius, Streptomyces coelicolor, Streptomyces glaucescens, Streptomyces roseofulvus, Streptomyces cinnamonensis, Streptomyces curacoi, and Amycolatopsis mediterranei (see Hopwood, D. A. and Sherman, D. H., Annu. Rev. Genet., 24:37-66 (1990), incorporated herein by reference). Other examples of polyketide-producing microorganisms that produce polyketides naturally include various Actinomadura, Dactylosporangium and Nocardia strains.

[0104] The term “sugar biosynthesis genes” as used herein refers to nucleic acid sequences from organisms such as Streptomyces venezuelae that encode sugar biosynthesis enzymes and is intended to include sequences of DNA from other polyketide-producing microorganisms which are identical or analogous to those obtained from Streptomyces venezuelae.

[0105] The term “sugar biosynthesis enzymes” as used herein refers to polypeptides which are involved in the biosynthesis and/or attachment of polyketide-associated sugars and their derivatives and intermediates.

[0106] The term “polyketide-associated sugar” refers to a sugar that is known to attach to polyketides or that can be attached to polyketides by the processes described herein.

[0107] The term “sugar derivative” refers to a sugar which is naturally associated with a polyketide but which is altered relative to the unmodified or native state, including but not limited to, N-3-α-desdimethyl D-desosamine.

[0108] The term “sugar intermediate” refers to an intermediate compound produced in a sugar biosynthesis pathway.

[0109] As used herein, the term “derivative” means that a particular compound produced by a host cell of the invention or prepared in vitro using polypeptides encoded by the nucleic acid molecules of the invention, is modified so that it comprises other moieties, e.g., peptide or polypeptide molecules, such as antibodies or fragments thereof, nucleic acid molecules, sugars, lipids, fats, a detectable signal molecule such as a radioisotope, e.g., gamma emitters, small chemicals, metals, salts, synthetic polymers, e.g., polylactide and polyglycolide, surfactants and glycosaminoglycans, which are covalently or non-covalently attached or linked to the compound.

[0110] A “recombinant” host cell of the invention has a genome that has been manipulated in vitro so as to alter, e.g., decrease or disrupt, or alternatively, increase, the function or activity of at least one gene, e.g., in the pik biosynthetic gene cluster, of the invention.

[0112] It will be appreciated by those skilled in the art that each atom of the compounds of the invention having a chiral center may exist in and be isolated in optically active and racemic forms. Some compounds may exhibit polymorphism. It is to be understood that the present invention encompasses any racemic, optically active, polymorphic or stereoisomeric form, or mixtures thereof, of a compound of the invention, which possess the useful properties described herein, it being well known in the art how to prepare optically active forms (for example, by resolution of the racemic form by recrystallization techniques, by synthesis from optically active starting materials, by chiral synthesis, or by chromatographic separation using a chiral stationary phase) and how to determine activity using the standard tests described herein, or using other similar tests which are well known in the art.

[0113] The term “sequence homology” or “sequence identity” means the proportion of base matches between two nucleic acid sequences or the proportion of amino acid matches between two amino acid sequences. When sequence homology is expressed as a percentage, e.g., 50%, the percentage denotes the proportion of matches over the length of sequence that is compared to some other sequence. Gaps (in either of the two sequences) are permitted to maximize matching; gap lengths of 15 bases or less are usually used, 6 bases or less are preferred with 2 bases or less more preferred. When using oligonucleotides as probes, the sequence homology between the target nucleic acid and the oligonucleotide sequence is generally not less than 17 target base matches out of 20 possible oligonucleotide base pair matches (85%); preferably not less than 9 matches out of 10 possible base pair matches (90%), and more preferably not less than 19 matches out of 20 possible base pair matches (95%).
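
By way of illustration only, the oligonucleotide probe thresholds recited above (17 of 20, 9 of 10, and 19 of 20 matches) may be expressed as the following minimal Python sketch. The function name `probe_tier` and the tier labels are hypothetical and are introduced here solely for illustration; exact fractions are used so that a probe of any length is compared against the same ratios.

```python
from fractions import Fraction

# Thresholds from the paragraph above, highest first: 19/20, 9/10, 17/20.
# The tier labels are illustrative names, not terms defined in the text.
TIERS = [
    ("more preferred", Fraction(19, 20)),
    ("preferred", Fraction(9, 10)),
    ("general", Fraction(17, 20)),
]


def probe_tier(matches: int, probe_length: int) -> str:
    """Return the highest tier met by a probe with `matches` matched bases."""
    if probe_length <= 0 or not 0 <= matches <= probe_length:
        raise ValueError("matches must lie between 0 and probe_length")
    ratio = Fraction(matches, probe_length)
    for name, minimum in TIERS:
        if ratio >= minimum:
            return name
    return "below threshold"
```

For example, `probe_tier(18, 20)` evaluates the ratio 18/20, which meets the 9/10 threshold but not the 19/20 threshold.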

[0114] Two amino acid sequences are homologous if there is a partial or complete identity between their sequences. For example, 85% homology means that 85% of the amino acids are identical when the two sequences are aligned for maximum matching. Gaps (in either of the two sequences being matched) are allowed in maximizing matching; gap lengths of 5 or less are preferred with 2 or less being more preferred. Alternatively and preferably, two protein sequences (or polypeptide sequences derived from them of at least 30 amino acids in length) are homologous, as this term is used herein, if they have an alignment score of more than 5 (in standard deviation units) using the program ALIGN with the mutation data matrix and a gap penalty of 6 or greater. See Dayhoff, M. O., in Atlas of Protein Sequence and Structure, 1972, volume 5, National Biomedical Research Foundation, pp. 101-110, and Supplement 2 to this volume, pp. 1-10. The two sequences or parts thereof are more preferably homologous if their amino acids are greater than or equal to 50% identical when optimally aligned using the ALIGN program.

[0115] The following terms are used to describe the sequence relationships between two or more polynucleotides: “reference sequence”, “comparison window”, “sequence identity”, “percentage of sequence identity”, and “substantial identity”. A “reference sequence” is a defined sequence used as a basis for a sequence comparison; a reference sequence may be a subset of a larger sequence, for example, as a segment of a full-length cDNA or gene sequence given in a sequence listing, or may comprise a complete cDNA or gene sequence. Generally, a reference sequence is at least 20 nucleotides in length, frequently at least 25 nucleotides in length, and often at least 50 nucleotides in length. Since two polynucleotides may each (1) comprise a sequence (i.e., a portion of the complete polynucleotide sequence) that is similar between the two polynucleotides, and (2) may further comprise a sequence that is divergent between the two polynucleotides, sequence comparisons between two (or more) polynucleotides are typically performed by comparing sequences of the two polynucleotides over a “comparison window” to identify and compare local regions of sequence similarity.

[0116] A “comparison window”, as used herein, refers to a conceptual segment of at least 20 contiguous nucleotides and wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) of 20 percent or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. Optimal alignment of sequences for aligning a comparison window may be conducted by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2: 482, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48: 443, by the search for similarity method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. (U.S.A.) 85: 2444, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by inspection, and the best alignment (i.e., resulting in the highest percentage of homology over the comparison window) generated by the various methods is selected.
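
As a non-limiting illustration of the global alignment approach of Needleman and Wunsch referred to above, the following Python sketch fills a dynamic programming matrix and traces back one optimal alignment. The scoring values (match +1, mismatch -1, linear gap penalty -1) and the function name `needleman_wunsch` are assumptions chosen for brevity; they are not the parameters of GAP, BESTFIT, or any other program named herein.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Globally align a and b; return (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First column and row: a prefix aligned entirely against gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Each cell holds the best score for aligning a[:i] with b[:j].
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[len(a)][len(b)], "".join(reversed(out_a)), "".join(reversed(out_b))
```

The two returned strings have equal length and use "-" to mark a gap, so they can be compared position by position over a comparison window.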

[0117] The term “sequence identity” means that two polynucleotide sequences are identical (i.e., on a nucleotide-by-nucleotide basis) over the window of comparison. The term “percentage of sequence identity” is calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence identity. The term “substantial identity” as used herein denotes a characteristic of a polynucleotide sequence, wherein the polynucleotide comprises a sequence that has at least 85 percent sequence identity, preferably at least 90 to 95 percent sequence identity, more usually at least 99 percent sequence identity as compared to a reference sequence over a comparison window of at least 20 nucleotide positions, frequently over a window of at least 20-50 nucleotides, wherein the percentage of sequence identity is calculated by comparing the reference sequence to the polynucleotide sequence which may include deletions or additions which total 20 percent or less of the reference sequence over the window of comparison.
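
The calculation of percentage of sequence identity recited above (matched positions divided by the window size, multiplied by 100) may be sketched in Python as follows. This is a minimal sketch under stated assumptions: the two inputs are taken to be already aligned strings of equal length in which "-" marks a gap, the window size is taken to be the aligned length, and the function names `percent_identity` and `is_substantially_identical` are hypothetical.

```python
def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percentage of positions in the window at which both sequences carry the same base."""
    if not aligned_ref or len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must be non-empty and of equal length")
    # A position counts as matched only if the bases are identical and not a gap.
    matched = sum(1 for r, q in zip(aligned_ref, aligned_query) if r == q and r != "-")
    window_size = len(aligned_ref)  # assumption: window = full aligned length
    return 100.0 * matched / window_size


def is_substantially_identical(aligned_ref: str, aligned_query: str,
                               min_percent: float = 85.0, min_window: int = 20) -> bool:
    """Apply the 85 percent / 20 position floor recited in the paragraph above."""
    if len(aligned_ref) < min_window:
        return False
    return percent_identity(aligned_ref, aligned_query) >= min_percent
```

The stricter levels recited above (90 to 95 percent, or 99 percent) may be checked by passing a different `min_percent`.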

[0118] As applied to polypeptides, the term “substantial identity” means that two peptide sequences, when optimally aligned, such as by the programs GAP or BESTFIT using default gap weights, share at least about 80 percent sequence identity, preferably at least about 90 percent sequence identity, more preferably at least about 95 percent sequence identity, and most preferably at least about 99 percent sequence identity.

[0119] In accordance with the present invention there is provided an isolated and purified nucleic acid molecule which encodes the entire pathway for methymycin, pikromycin, neomethymycin, narbomycin, or a combination thereof, which includes sugar biosynthetic genes that are linked thereto. Desirably, the nucleic acid molecule is DNA isolated from Streptomyces spp. The present invention further includes isolated and purified nucleic acid sequences which hybridize under standard or stringent conditions to the nucleic acid molecules of the invention. It is also understood that the invention encompasses isolated and purified polypeptides which may be encoded by the nucleic acid molecules of the invention.

[0120] The invention described herein can be used for the production of a diverse range of novel compounds including polyketides, e.g., antibiotics, and biodegradable PHA polymers through genetic redesign of DNA encoding a FAS or a PKS such as that found in Streptomyces spp. Thus, the isolation and characterization of this gene cluster allows for the selective production of antibiotics, the overproduction or underproduction of particular compounds, e.g., overproduction of certain antibiotics, and the production of novel compounds. For example, combinatorial biosynthetic-based modification of compounds may be accomplished by selective activation or disruption of specific genes within the cluster or incorporation of the genes into biased biosynthetic libraries which are assayed for a wide range of biological activities, to derive greater chemical diversity. A further example includes the introduction of biosynthetic gene(s) into a particular host cell so as to result in the production of a novel compound due to the activity of the biosynthetic gene(s) on other metabolites, intermediates or components of the host cells.

[0121] Further, different PHA synthases can be tested for their ability to polymerize monomers produced by the recombinant PKS or PHA monomer synthase into a biodegradable polymer. The invention also provides a method by which various PHA synthases can be tested for their specificity with respect to different monomer substrates.

[0122] The potential uses and applications of PHAs produced by PHA monomer synthases and PHA synthases include both medical and industrial applications. Medical applications of PHAs include surgical pins, sutures, staples, swabs, wound dressings, blood vessel replacements, bone replacements and plates, stimulation of bone growth by piezoelectric properties, and biodegradable carriers for long-term dosage of pharmaceuticals. Industrial applications of PHAs include disposable items such as baby diapers, packaging containers, bottles, wrappings, bags, and films, and biodegradable carriers for long-term dosage of herbicides, fungicides, insecticides, or fertilizers.

[0123] In animals, the biosynthesis of fatty acids de novo frommalonyl-CoA is catalyzed by FAS. For example, the rat FAS is a homodimerwith a subunit structure consisting of 2505 amino acid residues having amolecular weight of 272,340 Da. Each subunit consists of seven catalyticactivities in separate physical domains (Amy et al., Proc. Natl. Acad.Sci. USA, 6, 3114 (1989)). The physical location of six of the catalyticactivities, ketoacyl synthase (KS), malonyl/acetyltransferase (M/AT),enoyl reductase (ER), ketoreductase (KR), acyl carrier protein (ACP),and thioesterase (TE), has been established by (1) the identification ofthe various active site residues within the overall amino acid sequenceby isolation of catalytically active fragments from limited proteolyticdigests of the whole FAS, (2) the identification of regions within theFAS that exhibit sequence similarity with various monofunctionalproteins, (3) expression of DNA encoding an amino acid sequence withcatalytic activity to produce recombinant proteins, and (4) theidentification of DNA that does not encode catalytic activity, i.e., DNAencoding a linker region. (Smith et al., Proc. Natl. Acad. Sci. USA, 73,1184 (1976); Tsukamoto et al., J. Biol. Chem., 263, 16225 (1988); Ranganet al., J. Biol. Chem., 226, 19180 (1991)).

[0124] The seventh catalytic activity, dehydrase (DH), was identified as physically residing between AT and ER by an amino acid comparison of FAS with the amino acid sequences encoded by the three open reading frames of the eryA polyketide synthase (PKS) gene cluster of Saccharopolyspora erythraea. The three polypeptides that comprise this PKS are constructed from “modules” which resemble animal FAS, both in terms of their amino acid sequence and in the ordering of the constituent domains (Donadio et al., Gene, 111, 51 (1992); Benh et al., Eur. J. Biochem., 204, 39 (1992)).

[0125] One embodiment of the invention employs a FAS in which the DH is inactivated (FAS DH-). The FAS DH- employed in this embodiment of the invention is preferably a eukaryotic FAS DH- and, more preferably, a mammalian FAS DH-. The most preferred embodiment of the invention is a FAS where the active site in the DH has been inactivated by mutation. For example, Joshi et al. (J. Biol. Chem., 268, 22508 (1993)) changed the His⁸⁷⁸ residue in the rat FAS to an alanine residue by site-directed mutagenesis. In vitro studies showed that a FAS with this change (ratFAS206) produced 3-hydroxybutyrylCoA as a premature termination product from acetyl-CoA, malonyl-CoA and NADPH.

[0126] As shown below, a FAS DH- effectively replaces the β-ketothiolase and acetoacetyl-CoA reductase activities of the natural pathway by producing D(−)-3-hydroxybutyrate as a premature termination product, rather than the usual 16-carbon product, palmitic acid. This premature termination product can then be incorporated into PHB by a PHB synthase (see Example 2).

[0127] Another embodiment of the invention employs a recombinant Streptomyces spp. PKS to produce a variety of β-hydroxyCoA esters that can serve as monomers for a PHA synthase. One example of a DNA encoding a Type I PKS is the eryA gene cluster, which governs the synthesis of erythromycin aglycone deoxyerythronolide B (DEB). The gene cluster encodes six repeated units, termed modules or synthase units (SUs). Each module or SU, which comprises a series of putative FAS-like activities, is responsible for one of the six elongation cycles required for DEB formation. Thus, the processive synthesis of asymmetric acyl chains found in complex polyketides is accomplished through the use of a programmed protein template, where the nature of the chemical reactions occurring at each point is determined by the specificities in each SU.

[0128] Two other Type I PKS are encoded by the tyl (tylosin) (FIG. 4) and met (methymycin) (FIG. 5) gene clusters. The macrolide multifunctional synthases encoded by tyl and met provide a greater degree of metabolic diversity than that found in the eryA gene cluster. The PKSs encoded by the eryA gene cluster only catalyze chain elongation with methylmalonylCoA, as opposed to tyl and met PKSs, which catalyze chain elongation with malonylCoA, methylmalonylCoA and ethylmalonylCoA. Specifically, the tyl PKS includes two malonylCoA extender units and one ethylmalonylCoA extender unit, and the met PKS includes one malonylCoA extender unit. Thus, a preferred embodiment of the invention includes, but is not limited to, replacing catalytic activities encoded in met PKS open reading frame 1 (ORF1) to provide a DNA encoding a protein that possesses the required keto group processing capacity and short-chain acylCoA ester starter and extender unit specificity necessary to provide a saturated β-hydroxyhexanoylCoA or unsaturated β-hydroxyhexenoylCoA monomer.

[0129] In order to manipulate the catalytic specificities within each module, DNA encoding a catalytic activity must remain undisturbed. To identify the “linker regions,” i.e., the amino acid sequences lying between the amino acid sequences with catalytic activity, the amino acid sequences of related modules, preferably those encoded by more than one gene cluster, are compared. Linker regions are amino acid sequences which are less well conserved than amino acid sequences with catalytic activity (Witkowski et al., Eur. J. Biochem., 198, 571 (1991)).
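The comparison described above can be illustrated with a minimal Python sketch. It assumes the module sequences have already been aligned to equal length, scores each alignment column by the fraction of sequences sharing the most common residue, and reports runs of poorly conserved columns as candidate linker regions. The function names, the 0.6 threshold, the minimum run length, and the toy sequences are all hypothetical and chosen for illustration only; they are not taken from any PKS or FAS sequence of the invention.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """Fraction of sequences sharing the most common residue, per column.

    Assumes the sequences are already aligned to equal length. Gap
    characters, if present, are counted like any other symbol.
    """
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be pre-aligned to equal length")
    scores = []
    for i in range(length):
        column = [s[i] for s in aligned_seqs]
        top_count = Counter(column).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


def candidate_linker_spans(scores, threshold=0.6, min_len=3):
    """Half-open (start, end) spans whose conservation stays below threshold."""
    spans = []
    start = None
    for i, score in enumerate(scores):
        if score < threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                spans.append((start, i))
            start = None
    if start is not None and len(scores) - start >= min_len:
        spans.append((start, len(scores)))
    return spans


# Toy, made-up aligned fragments (hypothetical; not real module sequences).
toy_alignment = [
    "GHSAGEAPLTVDGHS",
    "GHSAGQRSMAVDGHS",
    "GHSAGTKDENVDGHS",
]

spans = candidate_linker_spans(column_conservation(toy_alignment))
print(spans)
```

In this toy alignment the conserved flanks stand in for catalytic domains and the variable middle stretch for a linker. A real analysis would start from a proper multiple sequence alignment and would likely use a substitution-matrix-based score rather than simple identity.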

[0130] In an alternative embodiment of the invention, to provide a DNA encoding a Type I PKS module with a TE and lacking a functional DH, a DNA encoding a module F, containing KS, MT, KR, ACP, and TE catalytic activities, is introduced at the 3′ end of a DNA encoding a first module (FIG. 6). Module F introduces the final (R)-3-hydroxyl acyl group at the final step of PHA monomer synthesis, as a result of the presence of a TE domain. DNA encoding a module F is not present in the eryA PKS gene cluster (Donadio et al., supra, 1991).

[0131] A DNA encoding a recombinant monomer synthase is inserted into an expression vector. The expression vector employed varies depending on the host cell to be transformed with the expression vector. That is, vectors are employed with transcription, translation and/or post-translational signals, such as targeting signals, necessary for efficient expression of the genes in various host cells into which the vectors are introduced. Such vectors are constructed and transformed into host cells by methods well known in the art. See Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor (1989). Preferred host cells for the vectors of the invention include insect, bacterial, and plant cells. Preferred insect cells include Spodoptera frugiperda cells such as Sf21, and Trichoplusia ni cells. Preferred bacterial cells include Escherichia coli, Streptomyces and Pseudomonas. Preferred plant cells include monocot and dicot cells, such as maize, rice, wheat, tobacco, legumes, carrot, squash, canola, soybean, potato, and the like.

[0132] Moreover, the appropriate subcellular compartment in which to locate the enzyme in eukaryotic cells must be considered when constructing eukaryotic expression vectors. Two factors are important: the site of production of the acetyl-CoA substrate, and the available space for storage of the PHA polymer. To direct the enzyme to a particular subcellular location, targeting sequences may be added to the sequences encoding the recombinant molecules.

[0133] The baculovirus system is particularly amenable to the introduction of DNA encoding a recombinant FAS or a PKS monomer synthase because an increasing variety of transfer plasmids are becoming available which can accommodate a large insert, and the virus can be propagated to high titers. Moreover, insect cells are adapted readily to suspension culture, facilitating relatively large-scale recombinant protein production. Further, recombinant proteins tend to be produced exclusively as soluble proteins in insect cells, thus obviating the need for refolding, a task that might be particularly daunting in the case of a large multifunctional protein. The Sf21/baculovirus system has routinely expressed milligram quantities of catalytically active recombinant fatty acid synthase. Finally, the baculovirus/insect cell system provides the ability to construct and analyze different synthase proteins for the ability to polymerize monomers into unique biodegradable polymers.

[0134] A further embodiment of the invention is the introduction of at least one DNA encoding a PHA synthase and a DNA encoding a PHA monomer synthase into a host cell. Such synthases include, but are not limited to, A. eutrophus 3-hydroxy, 4-hydroxy, and 5-hydroxy alkanoate synthases, Rhodococcus ruber C₃-C₅ hydroxyalkanoate synthases, Pseudomonas oleovorans C₆-C₁₄ hydroxyalkanoate synthases, P. putida C₆-C₁₄ hydroxyalkanoate synthases, P. aeruginosa C₅-C₁₀ hydroxyalkanoate synthases, P. resinovorans C₄-C₁₀ hydroxyalkanoate synthases, Rhodospirillum rubrum C₄-C₇ hydroxyalkanoate synthases, R. gelatinosus C₄-C₇ hydroxyalkanoate synthases, Thiocapsa pfennigii C₄-C₈ hydroxyalkanoate synthases, and Bacillus megaterium C₄-C, hydroxyalkanoate synthases.

[0135] The introduction of DNA(s) encoding more than one PHA synthase may be necessary to produce a particular PHA polymer due to the specificities exhibited by different PHA synthases. As multifunctional proteins are altered to produce unusual monomeric structures, synthase specificity may be problematic for particular substrates. Although the A. eutrophus PHB synthase utilizes only C4 and C5 compounds as substrates, it appears to be a good prototype synthase for initial studies since it is known to be capable of producing copolymers of 3-hydroxybutyrate and 4-hydroxybutyrate (Kunioka et al., Macromolecules, 22, 694 (1989)) as well as copolymers of 3-hydroxyvalerate, 3-hydroxybutyrate, and 5-hydroxyvalerate (Doi et al., Macromolecules, 19, 2860 (1986)). Other synthases, especially those of Pseudomonas aeruginosa (Timm et al., Eur. J. Biochem., 209, 15 (1992)) and Rhodococcus ruber (Pieper et al., FEMS Microbiol. Lett., 96, 73 (1992)), can also be employed in the practice of the invention. Synthase specificity may be alterable through molecular biological methods.

[0136] In yet another embodiment of the invention, a DNA encoding a FAS and a PHA synthase can be introduced into a single expression vector, obviating the need to introduce the genes into a host cell individually.

[0137] A further embodiment of the invention is the generation of a DNA encoding a recombinant multifunctional protein, which comprises a FAS, of either eukaryotic or prokaryotic origin, and a PKS module F. Module F will carry out the final chain extension to include two additional carbons and the reduction of the β-keto group, which results in a (R)-3-hydroxy acyl CoA moiety.

[0138] To produce this recombinant protein, DNA encoding the FAS TE is replaced with a DNA encoding a linker region which is normally found in the ACP-KS interdomain region of bimodular ORFs. DNA encoding a module F is then inserted 3′ to the DNA encoding the linker region. Different linker regions, such as those described below which vary in length and amino acid composition, can be tested to determine which linker most efficiently mediates or allows the required transfer of the nascent saturated fatty acid intermediate to module F for the final chain elongation and keto reduction steps. The resulting DNA encoding the protein can then be tested for expression of long-chain β-hydroxy fatty acids in insect cells, such as Sf21 cells, or Streptomyces, or Pseudomonas. The expected 3-hydroxy C-18 fatty acid can serve as a potential substrate for PHA synthases which are able to accept long-chain alkyl groups. A preferred embodiment of the invention is a FAS that has a chain length specificity between 4-22 carbons.

[0139] Examples of linker regions that can be employed in this embodiment of the invention include, but are not limited to, the ACP-KS linker regions encoded by the tyl ORF1 (ACP₁-KS₂; ACP₂-KS₃), and ORF3 (ACP₅-KS₆), and eryA ORF1 (ACP₁-KS₁; ACP₂-KS₂), ORF2 (ACP₃-KS₄) and ORF3 (ACP₅-KS₆).

[0140] This approach can also be used to produce shorter chain fatty acid groups by limiting the ability of the FAS unit to generate long-chain fatty acids. Mutagenesis of DNA encoding various FAS catalytic activities, starting with the KS, may result in the synthesis of short-chain (R)-3-hydroxy fatty acids.

[0141] The PHA polymers are then recovered from the biomass. Large-scale solvent extraction can be used, but is expensive. An alternative method involving heat shock with subsequent enzymatic and detergent digestive processes is also available (Byrom, Trends Biotechnol., 5, 246 (1987); Holmes, In: Developments in Crystalline Polymers, D. C. Bassett (ed.), pp. 1-65 (1988)). PHB and other PHAs are readily extracted from microorganisms by chlorinated hydrocarbons. Refluxing with chloroform has been extensively used; the resulting solution is filtered to remove debris and concentrated, and the polymer is precipitated with methanol or ethanol, leaving low-molecular-weight lipids in solution. Longer side-chain PHAs show a less restricted solubility than PHB and are, for example, soluble in acetone. Other strategies adopted include the use of ethylene carbonate and propylene carbonate as disclosed by Lafferty et al. (Chem. Rundschau, 30, 14 (1977)) to extract PHB from biomass. Scandola et al. (Int. J. Biol. Macromol., 10, 373 (1988)) reported that 1 M HCl-chloroform extraction of Rhizobium meliloti yielded PHB of M_(w)=6×10⁴ compared with 1.4×10⁶ when acetone was used.

[0142] Methods are well known in the art for the determination of the PHB or PHA content of microorganisms, the composition of PHAs, and the distribution of the monomer units in the polymer. Gas chromatography and high-pressure liquid chromatography are widely used for quantitative PHB analysis. See Anderson et al., Microbiol. Rev., 54, 450 (1990) for a review of such methods. NMR techniques can also be used to determine polymer composition and the distribution of monomer units.

[0143] Preparation of Variant Nucleic Acid Molecules and Variant Polypeptides of the Invention

[0144] The present invention also contemplates nucleic acid sequences which hybridize under stringent hybridization conditions to the nucleic acid sequences set forth herein. Stringent hybridization conditions are well known in the art and define a degree of sequence identity greater than about 80 to about 90%. Thus, nucleic acid sequences encoding variant polypeptides (FIG. 38), or nucleic acid sequences having conservative (silent) nucleotide substitutions (FIG. 37), are within the scope of the invention. Preferably, variant polypeptides encoded by the nucleic acid sequences of the invention are biologically active. The present invention also contemplates naturally occurring allelic variations and mutations of the nucleic acid sequences described herein.

[0145] As is well known in the art, because of the degeneracy of the genetic code, there are numerous other DNA and RNA molecules that can code for the same polypeptides as those encoded by the exemplified biosynthetic genes and fragments thereof. The present invention, therefore, contemplates those other DNA and RNA molecules which, on expression, encode the polypeptides of, for example, portions of SEQ ID NO:4 or SEQ ID NO:6. Having identified the amino acid residue sequence encoded by a sugar biosynthetic or macrolide biosynthetic gene, and with knowledge of all triplet codons for each particular amino acid residue, it is possible to describe all such encoding RNA and DNA sequences. DNA and RNA molecules other than those specifically disclosed herein, which molecules are characterized simply by a change in a codon for a particular amino acid, are within the scope of this invention.

[0146] The 20 common amino acids and their representative abbreviations, symbols and codons are well known in the art (see, for example, Molecular Biology of the Cell, Second Edition, B. Alberts et al., Garland Publishing Inc., New York and London, 1989). As is also well known in the art, codons constitute triplet sequences of nucleotides in mRNA molecules and, as such, are characterized by the base uracil (U) in place of the base thymine (T) which is present in DNA molecules. A simple change in a codon for the same amino acid residue within a polynucleotide will not change the structure of the encoded polypeptide. By way of example, it can be seen from SEQ ID NO:6 that a TCT codon for serine exists at nucleotide positions 1735-1737. However, it can also be seen from that same sequence that serine can be encoded by a TCA codon (see, e.g., nucleotide positions 1738-1740) and a TCC codon (see, e.g., nucleotide positions 1874-1876). Substitution of the latter codons for serine with the TCT codon for serine, or vice versa, does not substantially alter the DNA sequence of SEQ ID NO:6 and results in production of the same polypeptide. Similarly, substitutions of the recited codons with other equivalent codons can be made without departing from the scope of the present invention.
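The effect of swapping one serine codon for another can be checked with a short Python sketch. It builds the standard genetic code table and translates two DNA fragments that differ only in their serine codons (TCT, TCA, TCC). The fragments are hypothetical toy sequences made up for this illustration; they are not excerpts of SEQ ID NO:6, and the function name `translate` is likewise an assumption of the sketch.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA coding sequence (length divisible by 3) to one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical fragments: Met-Ser-Ser-Gly written with different serine codons.
original = "ATGTCTTCAGGC"   # ATG TCT TCA GGC
variant = "ATGTCCTCTGGC"    # ATG TCC TCT GGC

print(translate(original), translate(variant))
assert translate(original) == translate(variant)
```

Both fragments translate to the same peptide even though the nucleotide sequences differ, which is the degeneracy the paragraph above relies on.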

[0147] A nucleic acid molecule, segment or sequence of the present invention can also be an RNA molecule, segment or sequence. An RNA molecule contemplated by the present invention corresponds to, is complementary to or hybridizes under stringent conditions to any of the DNA sequences set forth herein. Exemplary and preferred RNA molecules are mRNA molecules that encode sugar biosynthetic or macrolide biosynthetic enzymes of this invention.

[0148] Mutations can be made to the native nucleic acid sequences of the invention and such mutants used in place of the native sequence, so long as the mutants are able to function with other sequences to collectively catalyze the synthesis of an identifiable polyketide or macrolide. Such mutations can be made to the native sequences using conventional techniques such as by preparing synthetic oligonucleotides including the mutations and inserting the mutated sequence into the gene using restriction endonuclease digestion. (See, e.g., Kunkel, T. A. Proc. Natl. Acad. Sci. USA (1985) 82:448; Geisselsoder et al. BioTechniques (1987) 5:786.) Alternatively, the mutations can be effected using a mismatched primer (generally 10-20 nucleotides in length) which hybridizes to the native nucleotide sequence (generally cDNA corresponding to the RNA sequence), at a temperature below the melting temperature of the mismatched duplex. The primer can be made specific by keeping primer length and base composition within relatively narrow limits and by keeping the mutant base centrally located. Zoller and Smith, Methods Enzymol., (1983) 100:468. Primer extension is effected using DNA polymerase, the product is cloned, and clones containing the mutated DNA, derived by segregation of the primer-extended strand, are selected. Selection can be accomplished using the mutant primer as a hybridization probe. The technique is also applicable for generating multiple point mutations. See, e.g., Dalbadie-McFarland et al., Proc. Natl. Acad. Sci. USA (1982) 79:6409. PCR mutagenesis will also find use for effecting the desired mutations.
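As a rough illustration of keeping the mutant base centrally located, the following Python sketch builds a mismatched primer of odd length around a chosen template position. The function name, the 17-nucleotide default, and the template are hypothetical; the primer is written in the same sense as the template for readability, and in practice its reverse complement may be needed depending on which strand is primed.

```python
def design_mismatch_primer(template, position, new_base, length=17):
    """Return a primer of odd `length` with `new_base` at its center.

    `position` is the 0-based index of the template base to be changed.
    The length is restricted to the 10-20 nucleotide range mentioned in
    the text, and must be odd so the mismatch sits exactly in the middle.
    """
    if not 10 <= length <= 20 or length % 2 == 0:
        raise ValueError("length must be an odd number between 10 and 20")
    if template[position] == new_base:
        raise ValueError("new_base must differ from the template base")
    flank = length // 2
    start, end = position - flank, position + flank + 1
    if start < 0 or end > len(template):
        raise ValueError("not enough flanking sequence around the position")
    return template[start:position] + new_base + template[position + 1:end]


# Hypothetical 30-nt template; change the C at index 15 to an A.
template = "ATGGCTAGCTAGGATCCGATCGATCGTACG"
primer = design_mismatch_primer(template, 15, "A")
print(primer)
```

The sketch only handles placement of the mismatch. Choosing the length and base composition so that the duplex is stable at the annealing temperature is a separate step that it does not attempt.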

[0149] Random mutagenesis of the nucleotide sequence can be accomplished by several different techniques known in the art, such as by altering sequences within restriction endonuclease sites, inserting an oligonucleotide linker randomly into a plasmid, by irradiation with X-rays or ultraviolet light, by incorporating incorrect nucleotides during in vitro DNA synthesis, by error-prone PCR mutagenesis, by preparing synthetic mutants or by damaging plasmid DNA in vitro with chemicals. Chemical mutagens include, for example, sodium bisulfite, nitrous acid, hydroxylamine, agents which damage or remove bases thereby preventing normal base-pairing such as hydrazine or formic acid, analogues of nucleotide precursors such as nitrosoguanidine, 5-bromouracil, 2-aminopurine, or acridine intercalating agents such as proflavine, acriflavine, quinacrine, and the like. Generally, plasmid DNA or DNA fragments are treated with chemicals, transformed into E. coli and propagated as a pool or library of mutant plasmids.

[0150] Large populations of random enzyme variants can be constructed in vivo using “recombination-enhanced mutagenesis.” This method employs two or more pools of, for example, 10⁶ mutants each of the wild-type encoding nucleotide sequence that are generated using any convenient mutagenesis technique and then inserted into cloning vectors.

[0151] The gene sequences can be inserted into one or more expression vectors, using methods known to those of skill in the art. Expression vectors may include control sequences operably linked to the desired genes. Suitable expression systems for use with the present invention include systems which function in eukaryotic and prokaryotic host cells. Prokaryotic systems are preferred, and in particular, systems compatible with Streptomyces spp. are of particular interest. Control elements for use in such systems include promoters, optionally containing operator sequences, and ribosome binding sites. Particularly useful promoters include control sequences derived from the gene clusters of the invention. However, other bacterial promoters, such as those derived from sugar metabolizing enzymes, such as galactose, lactose (lac) and maltose, will also find use in the expression cassettes encoding desosamine. Preferred promoters are Streptomyces promoters, including but not limited to the ermE*, pikA, and tipA promoters. Additional examples include promoter sequences derived from biosynthetic enzymes such as tryptophan (trp), the β-lactamase (bla) promoter system, bacteriophage lambda PL, and T5. In addition, synthetic promoters, such as the tac promoter (U.S. Pat. No. 4,551,433), which do not occur in nature, also function in bacterial host cells.

[0152] Other regulatory sequences may also be desirable which allow for regulation of expression of the genes relative to the growth of the host cell. Regulatory sequences are known to those of skill in the art, and examples include those which cause the expression of a gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound. Other types of regulatory elements may also be present in the vector, for example, enhancer sequences.

[0153] Selectable markers can also be included in the recombinant expression vectors. A variety of markers are known which are useful in selecting for transformed cell lines and generally comprise a gene whose expression confers a selectable phenotype on transformed cells when the cells are grown in an appropriate selective medium. Such markers include, for example, genes which confer antibiotic resistance or sensitivity to the plasmid. Alternatively, several polyketides are naturally colored and this characteristic provides a built-in marker for selecting cells successfully transformed by the present constructs.

[0154] The various subunits of interest can be cloned into one or more recombinant vectors as individual cassettes, with separate control elements, or under the control of, e.g., a single promoter. The subunits can include flanking restriction sites to allow for the easy deletion and insertion of other subunits so that hybrid PKSs can be generated. The design of such unique restriction sites is known to those of skill in the art and can be accomplished using the techniques described above, such as site-directed mutagenesis and PCR.

[0155] For sequences generated by random mutagenesis, the choice of vector depends on the pool of mutant sequences, i.e., donor or recipient, with which they are to be employed. Furthermore, the choice of vector determines the host cell to be employed in subsequent steps of the claimed method. Any transducible cloning vector can be used as a cloning vector for the donor pool of mutants. It is preferred, however, that phagemids, cosmids, or similar cloning vectors be used for cloning the donor pool of mutant encoding nucleotide sequences into the host cell. Phagemids and cosmids, for example, are advantageous vectors due to the ability to insert and stably propagate therein larger fragments of DNA than in M13 phage and λ phage, respectively. Phagemids which will find use in this method generally include hybrids between plasmids and filamentous phage cloning vehicles. Cosmids which will find use in this method generally include λ phage-based vectors into which cos sites have been inserted. Recipient pool cloning vectors can be any suitable plasmid. The cloning vectors into which pools of mutants are inserted may be identical or may be constructed to harbor and express different genetic markers (see, e.g., Sambrook et al., supra). The utility of employing such vectors having different marker genes may be exploited to facilitate a determination of successful transduction.

[0156] Thus, for example, the cloning vector employed may be an E. coli/Streptomyces shuttle vector (see, for example, U.S. Pat. Nos. 4,416,994, 4,343,906, 4,477,571, 4,362,816, and 4,340,674), a cosmid, a plasmid, a bacterial artificial chromosome (see, e.g., Zhang and Wing, Plant Mol. Biol., 35, 115 (1997); Schalkwyk et al., Curr. Op. Biotech., 6, 37 (1995); and Monaco and Larin, Trends in Biotech., 12, 280 (1994)), or a phagemid, and the host cell may be a bacterial cell such as E. coli, Penicillium patulum, and Streptomyces spp. such as S. lividans, S. venezuelae, or S. lavendulae, or a eukaryotic cell such as fungi, yeast or a plant cell, e.g., monocot and dicot cells, preferably cells that are regenerable.

[0157] Moreover, recombinant polypeptides having a particular activity may be prepared via “gene-shuffling” (see, for example, Crameri et al., Nature, 391, 288 (1998); Patten et al., Curr. Op. Biotech., 8, 724 (1997); U.S. Pat. Nos. 5,837,458, 5,834,252, 5,830,727, 5,811,238, 5,605,793).

[0158] For phagemids, upon infection of the host cell which contains a phagemid, single-stranded phagemid DNA is produced, packaged and extruded from the cell in the form of a transducing phage in a manner similar to other phage vectors. Thus, clonal amplification of mutant encoding nucleotide sequences carried by phagemids is accomplished by propagating the phagemids in a suitable host cell.

[0159] Following clonal amplification, the cloned donor pool of mutants is infected with a helper phage to obtain a mixture of phage particles containing either the helper phage genome or phagemids carrying mutant alleles of the wild-type encoding nucleotide sequence.

[0160] Infection, or transfection, of host cells with helper phage is generally accomplished by methods well known in the art (see, e.g., Sambrook et al., supra; and Russell et al. (1986) Gene 45:333-338).

[0161] The helper phage may be any phage which can be used in combination with the cloning phage to produce an infective transducing phage. For example, if the cloning vector is a cosmid, the helper phage will necessarily be a λ phage. Preferably, the cloning vector is a phagemid and the helper phage is a filamentous phage, and preferably phage M13.

[0162] If desired, after infecting the phagemid with helper phage and obtaining a mixture of phage particles, the transducing phage can be separated from helper phage based on size difference (Barnes et al. (1983) Methods Enzymol. 101:98-122), or other similarly effective technique.

[0163] The entire spectrum of cloned donor mutations can now be transduced into clonally amplified recipient cells into which has been transduced or transformed a pool of mutant encoding nucleotide sequences. Recipient cells which may be employed in the method disclosed and claimed herein may be, for example, E. coli, or other bacterial expression systems which are not recombination deficient. A recombination deficient cell is a cell in which the frequency of recombination events is greatly reduced, such as rec- mutants of E. coli (see Clark et al. (1965) Proc. Natl. Acad. Sci. USA 53:451-459).

[0164] These transductants can now be selected for the desired expressed protein property or characteristic and, if necessary or desirable, amplified. Optionally, if the phagemids into which each pool of mutants is cloned are constructed to express different genetic markers, as described above, transductants may be selected by way of their expression of both donor and recipient plasmid markers.

[0165] The recombinants generated by the above-described methods can then be subjected to selection or screening by any appropriate method, for example, enzymatic or other biological activity.

[0166] The above cycle of amplification, infection, transduction, and recombination may be repeated any number of times using additional donor pools cloned on phagemids. As above, the phagemids into which each pool of mutants is cloned may be constructed to express a different marker gene. Each cycle could increase the number of distinct mutants by up to a factor of 10⁶. Thus, if the probability of occurrence of an inter-allelic recombination event in any individual cell is f (a parameter that is actually a function of the distance between the recombining mutations), the transduced culture from two pools of 10⁶ allelic mutants will express up to 10¹² distinct mutants in a population of 10¹²/f cells.
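The arithmetic in the preceding paragraph can be restated as a brief Python sketch. It multiplies the pool sizes to obtain the upper bound on distinct recombinants and divides by f to obtain the population size at which that bound could be reached. The function names and the value f = 0.01 are illustrative assumptions only; the text does not give a value for f.

```python
import math


def max_distinct_recombinants(*pool_sizes):
    """Upper bound on distinct recombinants: the product of the pool sizes."""
    return math.prod(pool_sizes)


def cells_required(recombination_probability, *pool_sizes):
    """Population size needed if only a fraction f of cells recombine."""
    if not 0 < recombination_probability <= 1:
        raise ValueError("f must be in the interval (0, 1]")
    return max_distinct_recombinants(*pool_sizes) / recombination_probability


donor_pool = 10**6
recipient_pool = 10**6
f = 0.01  # hypothetical per-cell recombination probability

print(max_distinct_recombinants(donor_pool, recipient_pool))  # 10**12
print(cells_required(f, donor_pool, recipient_pool))          # about 10**14 cells
```

Adding a third pool of 10⁶ mutants in a further cycle raises the bound by another factor of 10⁶, which is the per-cycle increase the paragraph refers to.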

[0167] The invention will be further described by the following non-limiting examples.

I. Experimental Procedures

[0168] Materials and Methods

[0169] Materials. Sodium R-(−)-3-hydroxybutyrate, coenzyme-A, ethylchloroformate, pyridine and diethyl ether were purchased from Sigma Chemical Co. Amberlite IR-120 was purchased from Mallinckrodt Inc. 6-O-(N-Heptylcarbamoyl)methyl α-D-glucopyranoside (Hecameg) was obtained from Vegatec (Villejuif, France). Two-piece spectrophotometer cells with pathlengths of 0.1 (#20/0-Q-1) and 0.01 cm (#20/0-Q-0.1) were obtained from Starna Cells Inc. (Atascadero, Calif.). Rabbit anti-A. eutrophus PHA synthase antibody was a gracious gift from Dr. F. Srienc and S. Stoup (Biological Process Technology Institute, University of Minnesota). Sf21 cells and T. ni cells were kindly provided by Greg Franzen (R&D Systems, Minneapolis, Minn.) and Stephen Harsch (Department of Veterinary Pathobiology, University of Minnesota), respectively.

[0170] Plasmid pFAS206 and a recombinant baculoviral clone encoding FAS206 (Joshi et al., J. Biol. Chem., 268, 22508 (1993)) were generous gifts of A. Joshi and S. Smith. Plasmid pAet41 (Peoples et al., J. Biol. Chem., 264, 15298 (1989)), the source of the A. eutrophus PHB synthase, was obtained from A. Sinskey. Baculovirus transfer vector, pBacPAK9, and linearized baculoviral DNA, were obtained from Clontech Inc. (Palo Alto, Calif.). Restriction enzymes, T4 DNA ligase, E. coli DH5α competent cells, molecular weight standards, lipofectin reagent, Grace's insect cell medium, fetal bovine serum (FBS), and antibiotic/antimycotic reagent were obtained from GIBCO-BRL (Grand Island, N.Y.). Tissue culture dishes were obtained from Corning Inc. Spinner flasks were obtained from Bellco Glass Inc. Seaplaque agarose GTG was obtained from FMC Bioproducts Inc.

[0171] Methods

[0172] Preparation of R-3HBCoA. R-(−)-3-HBCoA was prepared by the mixed anhydride method described by Haywood et al., FEMS Microbiol. Lett., 57, 1 (1989). 60 mg (0.58 mmol) of R-(−)-3-hydroxybutyric acid was freeze dried and added to a solution of 72 mg of pyridine in 10 ml diethyl ether at 0° C. Ethylchloroformate (100 mg) was added, and the mixture was allowed to stand at 4° C. for 60 minutes. Insoluble pyridine hydrochloride was removed by centrifugation. The resulting anhydride was added, dropwise with mixing, to a solution of 100 mg coenzyme-A (0.13 mmol) in 4 ml 0.2 M potassium bicarbonate, pH 8.0 at 0° C. The reaction was monitored by the nitroprusside test of Stadtman, Meth. Enzymol., 3, 931 (1957), to ensure sufficient anhydride was added to esterify all the coenzyme-A. The concentration of R-3-HBCoA was determined by measuring the absorbance at 260 nm (ε=16.8 mM⁻¹ cm⁻¹; 18).
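The quantities in this preparation can be cross-checked with a short Python sketch. It converts the stated masses to millimoles and applies the Beer-Lambert relation, c = A/(ε·l), to an absorbance reading at 260 nm. The molecular weights (104.10 g/mol for 3-hydroxybutyric acid, 767.53 g/mol for coenzyme A free acid), the interpretation of the extinction coefficient as 16.8 mM⁻¹ cm⁻¹, and the example absorbance of 0.84 in a 0.1 cm cell are assumptions made for illustration and are not stated in the text.

```python
def mass_to_mmol(mass_mg, molecular_weight_g_per_mol):
    """Convert a mass in mg to mmol (mg divided by g/mol gives mmol)."""
    return mass_mg / molecular_weight_g_per_mol


def concentration_mM(absorbance, epsilon_per_mM_per_cm, path_length_cm):
    """Beer-Lambert law: concentration = A / (epsilon * path length)."""
    return absorbance / (epsilon_per_mM_per_cm * path_length_cm)


# Assumed molecular weights (not given in the text).
MW_3HB_ACID = 104.10   # g/mol, 3-hydroxybutyric acid
MW_COA = 767.53        # g/mol, coenzyme A free acid

acid_mmol = mass_to_mmol(60, MW_3HB_ACID)   # about 0.58 mmol
coa_mmol = mass_to_mmol(100, MW_COA)        # about 0.13 mmol
print(f"acid: {acid_mmol:.2f} mmol, CoA: {coa_mmol:.2f} mmol")

# Hypothetical reading: A260 = 0.84 measured in a 0.1 cm pathlength cell.
conc = concentration_mM(0.84, 16.8, 0.1)
print(f"R-3-HBCoA: {conc:.2f} mM")
```

Under these assumptions the acid is in roughly fourfold molar excess over coenzyme A, consistent with the stated aim of esterifying all of the coenzyme A.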

[0173] Construction of pBP-phbC. The phbC gene (approximately 1.8 kb) was excised from pAet41 (Peoples et al., J. Biol. Chem., 264, 15293 (1989)) by digestion with BstBI and StuI, purified as described by Williams et al. (Gene, 109, 445 (1991)), and ligated to pBacPAK9 digested with BstBI and StuI. This resulted in pBP-phbC, the baculovirus transfer vector used in formation of recombinant baculovirus particles carrying phbC.

[0174] Large-scale expression of PHA synthase. A 1 L culture of T. ni cells (1.2×10⁶ cells/ml) in logarithmic growth was infected by the addition of 50 ml recombinant viral stock solution (2.5×10⁸ pfu/ml), resulting in a multiplicity of infection (MOI) of 10. This infected culture was split between two Bellco spinners (350 ml/500 ml spinner, 700 ml/1 L spinner) to facilitate oxygenation of the culture. These cultures were incubated at 28° C. and stirred at 60 rpm for 60 hours. Infected cells were harvested by centrifugation at 1000×g for 10 minutes at 4° C. Cells were flash frozen in liquid N₂ and stored in 4 equal aliquots at −80° C. until purification.
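The stated multiplicity of infection follows from the culture and viral stock figures, as the Python sketch below shows. It divides the total plaque-forming units added by the total number of cells. The function name is an assumption of the sketch; the numbers are those given in the paragraph above.

```python
def multiplicity_of_infection(virus_volume_ml, titer_pfu_per_ml,
                              culture_volume_ml, cells_per_ml):
    """MOI = total plaque-forming units added / total cells in the culture."""
    total_pfu = virus_volume_ml * titer_pfu_per_ml
    total_cells = culture_volume_ml * cells_per_ml
    return total_pfu / total_cells


# 50 ml of 2.5e8 pfu/ml stock added to 1 L of culture at 1.2e6 cells/ml.
moi = multiplicity_of_infection(50, 2.5e8, 1000, 1.2e6)
print(f"MOI = {moi:.1f}")  # MOI = 10.4
```

The computed value of about 10.4 rounds to the MOI of 10 quoted in the text.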

[0175] Insect cell maintenance and recombinant baculovirus formation. Sf21 cells were maintained at 26-28° C. in Grace's insect cell medium supplemented with 10% FBS, 1.0% pluronic F68, and 1.0% antibiotic/antimycotic (GIBCO-BRL). Cells were typically maintained in suspension at 0.2-2.0×10⁶/ml in 60 ml total culture volume in 100 ml spinner flasks at 55-65 rpm. Cell viability during the culture period was typically 95-100%. The procedures for use of the transfer vector and baculovirus were essentially those described by the manufacturer (Clontech, Inc.). Purified pBP-phbC and linearized baculovirus DNA were used for cotransfection of Sf21 cells using the liposome-mediated method (Felgner et al., Proc. Natl. Acad. Sci. USA, 84, 7413 (1987)) utilizing Lipofectin (GIBCO-BRL). Four days later, cotransfection supernatants were utilized for plaque purification. Recombinant viral clones were purified from plaque assay plates containing 1.5% Seaplaque GTG after 5-7 days at 28° C. Recombinant viral clone stocks were then amplified in T25-flask cultures (4 ml, 3×10⁶/ml on day 0) for 4 days; infected cells were identified by their morphology and size and then screened by SDS/PAGE using 10% polyacrylamide gels (Laemmli, Nature, 227, 680 (1970)) for production of PHA synthase.

[0176] Purification of PHA synthase from BTI-TN-5B1-4 T. ni cells. Purification of PHA synthase was performed according to the method of Gerngross et al., Biochemistry, 33, 9311 (1994) with the following alterations. One aliquot (110 mg protein) of frozen cells was thawed on ice and resuspended in 10 mM KPi (pH 7.2), 5% glycerol, and 0.05% Hecameg (Buffer A) containing the following protease inhibitors at the indicated final concentrations: benzamidine (2 mM), phenylmethylsulfonyl fluoride (PMSF, 0.4 mM), pepstatin (2 μg/ml), leupeptin (2.5 μg/ml), and Nα-p-tosyl-L-lysine chloromethyl ketone (TLCK, 2 mM). EDTA was omitted at this stage due to its incompatibility with hydroxylapatite (HA). This mixture was homogenized with three series of 10 strokes each in two Thomas homogenizers while partially submerged in an ice bath and then sonicated for 2 minutes in a Branson Sonifier 250 at 30% cycle, 30% power while on ice. All subsequent procedures were carried out at 4° C.

[0177] The lysate was immediately centrifuged at 100,000×g in a Beckman 50.2 Ti rotor for 80 minutes, and the resulting supernatant (10.5 ml, 47 mg) was immediately filtered through a 0.45 μm Uniflow filter (Schleicher and Schuell Inc., Keene, N.H.) to remove any remaining insoluble matter. Aliquots of the soluble fraction (1.5 ml, 7 mg) were loaded onto a 5 ml BioRad Econo-Pac HTP column that had been equilibrated with Buffer A (+ protease inhibitor mix) attached to a BioRad Econo-system, and the column was washed with 30 ml Buffer A. All chromatographic steps were carried out at a flow rate of 0.8 ml/minute. PHA synthase was eluted from the HA column with a 32×32 ml linear gradient from 10 to 300 mM KPi.

[0178] Fraction collection tubes were prepared by addition of 30 μl of 100 mM EDTA to provide a metalloprotease inhibitor at 1 mM immediately after HA chromatography. PHA synthase was eluted in a broad peak between 110-180 mM KPi. Fractions (3 ml) containing significant PHA synthase activity were pooled and stored at 0° C. until the entire soluble fraction had been run through the chromatographic process. Pooled fractions then were concentrated at 4° C. by use of a Centriprep-30 concentrator (Amicon) to 3.8 mg/ml. Aliquots (0.5 ml) were either flash frozen and stored in liquid N₂ or glycerol was added to a final concentration of 50% and samples (1.9 mg/ml) were stored at −20° C.
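The two dilutions in the preceding paragraph can be verified with a simple mixing calculation. This sketch assumes that the EDTA volume placed in each tube is 30 μl (the volume that reproduces the stated 1 mM in a 3 ml fraction) and that the 50% glycerol samples were made by adding an equal volume of undiluted glycerol; both are assumptions, and the function name is illustrative.

```python
def final_concentration(stock_conc, stock_volume_ml, added_volume_ml):
    """Concentration of a solute after mixing its stock with another volume."""
    return stock_conc * stock_volume_ml / (stock_volume_ml + added_volume_ml)

# 30 ul (0.030 ml) of 100 mM EDTA receiving a 3 ml column fraction.
edta_mM = final_concentration(100.0, 0.030, 3.0)

# 0.5 ml of 3.8 mg/ml protein plus an equal volume of glycerol (assumed).
protein_mg_per_ml = final_concentration(3.8, 0.5, 0.5)

print(f"EDTA: {edta_mM:.2f} mM, protein: {protein_mg_per_ml:.1f} mg/ml")
```

Under these assumptions the EDTA concentration comes out at about 1 mM and the protein concentration at 1.9 mg/ml, matching the values given.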

[0179] Western analysis. Samples of T. ni cells were fractionated by SDS-PAGE on 10% polyacrylamide gels, and the proteins then were transferred to 0.2 μm nitrocellulose membranes using a BioRad Transblot SD Semi-Dry electrophoretic transfer cell according to the manufacturer's instructions. Proteins were transferred for 1 hour at 15 V. The membrane was rinsed with doubly distilled H₂O, dried, and treated with phosphate-buffered saline (PBS) containing 0.05% Tween-20 (PBS-Tween) and 3% nonfat dry milk to block non-specific binding sites. Primary antibody (rabbit anti-PHA synthase) was applied in fresh blocking solution and incubated at 25° C. for 2 hours. Membranes were then washed four times for 10 minutes with PBS-Tween followed by the addition of horseradish peroxidase-conjugated goat-anti-rabbit antibody (Boehringer-Mannheim) diluted 10,000× in fresh blocking solution and incubated at 25° C. for 1 hour. Membranes were washed finally in three changes (10 minutes) of PBS, and the immobilized peroxidase label was detected using the chemiluminescent LumiGLO substrate kit (Kirkegaard and Perry, Gaithersburg, Md.) and X-ray film.

[0180] N-terminal analysis. Approximately 10 μg of purified PHA synthase was run on a 10% SDS-polyacrylamide gel, transferred to PVDF (Immobilon-PSQ, Millipore Corporation, Bedford, Mass.), stained with Amido Black, and sequenced on a 494 Procise Protein Sequencer (Perkin-Elmer, Applied Biosystems Division, Foster City, Calif.).

[0181] Double-infection protocol. Four 100 ml spinner flasks were each inoculated with 8×10⁷ cells in 50 ml of fresh insect medium. To flask 1, an additional 20 ml of fresh insect medium was added (uninfected control); to flask 2, 10 ml BacPAK6::phbC viral stock (1×10⁸ pfu/ml) and 10 ml fresh insect medium were added; to flask 3, 10 ml BacPAK6::FAS206 viral stock (1×10⁸ pfu/ml) and 10 ml fresh insect medium were added; and to flask 4, 10 ml BacPAK6::phbC viral stock (1×10⁸ pfu/ml) and 10 ml BacPAK6::FAS206 viral stock (1×10⁸ pfu/ml) were added. These viral infections were carried out at a multiplicity of infection of approximately 10. Cultures were maintained under normal growth conditions and 15 ml samples were removed at 24, 48, and 72 hour time points. Cells were collected by gentle centrifugation at 1000×g for 5 minutes, the medium was discarded, and the cells were immediately stored at −70° C.
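The per-virus multiplicity of infection in each flask follows from the stock volumes, titer, and cell number given above. The sketch below tabulates it for the four flasks; the dictionary layout and names are illustrative only.

```python
CELLS_PER_FLASK = 8e7      # cells inoculated into each flask (from the text)
TITER_PFU_PER_ML = 1e8     # titer of both viral stocks (from the text)

# Volume (ml) of each viral stock added to each flask.
flasks = {
    "flask 1 (uninfected)": {},
    "flask 2 (phbC)": {"BacPAK6::phbC": 10},
    "flask 3 (FAS206)": {"BacPAK6::FAS206": 10},
    "flask 4 (both)": {"BacPAK6::phbC": 10, "BacPAK6::FAS206": 10},
}

def moi_per_virus(stock_volumes_ml):
    """Return pfu per cell for each virus added to one flask."""
    return {virus: volume * TITER_PFU_PER_ML / CELLS_PER_FLASK
            for virus, volume in stock_volumes_ml.items()}

moi_table = {name: moi_per_virus(stocks) for name, stocks in flasks.items()}
for name, mois in moi_table.items():
    print(name, mois)
```

Each virus is present at 12.5 pfu per cell in the flasks that receive it, in line with the stated value of approximately 10; in flask 4 that figure applies to each of the two viruses separately.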

[0182] PHA synthase assays. Coenzyme A released by PHA synthase in the process of polymerization was monitored precisely as described by Gerngross et al. (supra) using 5,5′-dithiobis(2-nitrobenzoic acid) (DTNB) (Ellman, Arch. Biochem. Biophys., 82, 70 (1959)).

[0183] The presence of HBCoA was monitored spectrophotometrically. Assays were performed at 25° C. in a Hewlett Packard 8452A diode array spectrophotometer equipped with a water-jacketed cell holder. Two-piece Starna Spectrosil spectrophotometer cells with pathlengths of 0.1 and 0.01 cm were employed to avoid errors arising from the compression of the absorbance scale at higher values. Absorbance was monitored at 232 nm, and an ε₂₃₂ of 4.5×10³ M⁻¹ cm⁻¹ was used in calculations. One unit (U) of enzyme is the amount required to hydrolyze 1 μmol of substrate minute⁻¹. Buffer (0.15 M KPi, pH 7.2) and substrate were equilibrated to 25° C. and then combined in an Eppendorf tube also at 25° C. Enzyme was added and mixed once in the pipet tip used to transfer the entire mixture to the spectrophotometer cell. The two-piece cell was immediately assembled and placed in the spectrophotometer with the cell holder (type CH) adapted for the standard 10 mm pathlength cell holder of the spectrophotometer. Manipulations of sample, from mixing to initiation of monitoring, took only 10-15 seconds. Absorbance was continually monitored for up to 10 minutes. Reactions were calibrated against a solution of buffer and enzyme (no substrate), which led to absorbance values that represented substrate only.
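The conversion from an absorbance slope to enzyme units follows the Beer-Lambert relation, using the extinction coefficient given above. The following is a sketch under stated assumptions: one unit is taken as 1 μmol of substrate hydrolyzed per minute (consistent with the activity units used elsewhere in this document), and the slope and assay volume in the example call are hypothetical values chosen only to illustrate the arithmetic.

```python
EPSILON_232 = 4.5e3  # M^-1 cm^-1, extinction coefficient from the text

def synthase_units(delta_a_per_min, pathlength_cm, assay_volume_ml):
    """Convert a rate of absorbance decrease at 232 nm into umol/min."""
    # Beer-Lambert: change in concentration = change in A / (epsilon * l)
    molar_rate = delta_a_per_min / (EPSILON_232 * pathlength_cm)  # mol/L/min
    litres = assay_volume_ml / 1000.0
    return molar_rate * litres * 1e6  # umol/min

# Hypothetical example: slope of 0.045 A/min in the 0.1 cm cell, 0.1 ml assay.
units = synthase_units(0.045, 0.1, 0.1)
print(f"{units:.3f} U in the assay")
```

Dividing the result by the milligrams of protein in the assay would give a specific activity in U/mg.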

[0184] PHB assay. PHB was assayed from Sf21 cell samples according to the propanolysis method of Riis et al., J. Chromo., 445, 285 (1988). Cell pellets were thawed on ice, resuspended in 1 ml cold ddH₂O and transferred to 5 ml screwtop test tubes with teflon seals. Two ml of ddH₂O were added, the cells were washed and centrifuged and then 3 ml of acetone were added and the cells washed and centrifuged. The samples were then desiccated by placing them in a 94° C. oven for 12 hours. The following day 0.5 ml of 1,2-dichloroethane, 0.5 ml acidified propanol (20 ml HCl, 80 ml 1-propanol) and 50 μl benzoic acid standard were added and the sealed tubes were heated to 100° C. in a boiling water bath for 2 hours with periodic vortexing. The tubes were cooled to room temperature and the organic phase was used for gas-chromatographic (GC) analysis using a Hewlett Packard 5890A gas chromatograph equipped with a Hewlett Packard 7673A automatic injector and a fused silica capillary column, DB-WAX 30W of 30 meter length. Positive samples were further subjected to GC-mass spectrometric (MS) analysis for the presence of propylhydroxybutyrate using a Kratos MS25 GC/MS. The following parameters were used: source temperature, 210° C.; voltage, 70 eV; and accelerating voltage, 4 keV.

[0185] Catalytic activities

[0186] Ketoacyl synthase (KS) activity was assessed radiochemically by the condensation-¹⁴CO₂ exchange reaction (Smith et al., PNAS USA, 73, 1184 (1976)).

[0187] Transferase (AT) activity was assayed, using malonyl-CoA as donor and pantetheine as acceptor, by determining spectrophotometrically the free CoA released in a coupled ATP citrate-lyase-malate dehydrogenase reaction (see Rangan et al., J. Biol. Chem., 266, 19180 (1991)).

[0188] Ketoreductase (KR) was assayed spectrophotometrically at 340 nm: assay systems contained 0.1 M potassium phosphate buffer (pH 7), 0.15 mM NADPH, enzyme and either 10 mM trans-1-decalone or 0.1 mM acetoacetyl-CoA substrate.

[0189] Dehydrase (DH) activity was assayed spectrophotometrically at 270 nm using S-DL-β-hydroxybutyryl N-acetylcysteamine as substrate (Kumar et al., J. Biol. Chem., 245, 4732 (1970)).

[0190] Enoyl reductase (ER) activity was assayed spectrophotometrically at 340 nm essentially as described by Strom et al. (J. Biol. Chem., 254, 8159 (1979)); the assay system contained 0.1 M potassium phosphate buffer (pH 7), 0.15 mM NADPH, 0.375 nM crotonoyl-CoA, 20 μM CoA and enzyme.

[0191] Thioesterase (TE) activity was assessed radiochemically by extracting and assaying the [¹⁴C]palmitic acid formed from [1-¹⁴C]palmitoyl-CoA during a 3 minute incubation (Smith, Meth. Enzymol., 71C, 181 (1981)); the assay was in a final volume of 0.1 ml and contained 25 mM potassium phosphate buffer (pH 8), 20 μM [1-¹⁴C]palmitoyl-CoA (20 nCi) and enzyme.

[0192] Assay of overall fatty acid synthase activity was performed spectrophotometrically as described previously by Smith et al. (Meth. Enzymol., 35, 65 (1975)). All enzyme activities were assayed at 37° C. except the transferase, which was assayed at 20° C. Activity units indicate μmol of substrate consumed/minute. All assays were conducted, at a minimum, at two different protein concentrations with the appropriate enzyme and substrate blanks included.

II. EXAMPLES

Example 1 Expression of A. eutrophus PHA Synthase Using a Baculovirus System

[0193] Recent work has shown that PHA synthase from A. eutrophus can be overexpressed in E. coli, in the absence of 3-ketothiolase and acetoacetyl-CoA reductase (Gerngross et al., supra) and can be expressed in plants (see Poirier et al., Biotech, 13, 142 (1995) for a review). Isolation of the soluble form of PHA synthase provides opportunities to examine the mechanistic details of the priming and initiation reactions. Because the baculovirus system has been successful for the expression of a number of prokaryotic genes as soluble proteins, and insect cells, unlike bacterial expression systems, carry out a wide array of post-translational modifications, the baculovirus expression system appeared ideal for the expression of large quantities of soluble PHA synthase, a protein that must be modified by phosphopantetheine in order to be catalytically active (Gerngross et al., supra).

[0194] Purification of PHA synthase. The purification procedure employed for PHA synthase is a modification of Gerngross et al. (supra) involving the elimination of the second liquid chromatographic step and inclusion of a protease-inhibitor cocktail in all buffers. All steps were carried out on ice or at 4° C. except where noted. Frozen cells were thawed on ice in 10 ml of Buffer A (10 mM KPi, pH 7.2, 5% glycerol, and 0.05% Hecameg) and then immediately homogenized prior to centrifugation and HA chromatography.

[0195] The results of these efforts are summarized in Table 1 and FIG. 7. A prominent band at 64 kDa is visible in total, soluble, and HA eluate protein samples fractionated by SDS/PAGE (lanes 4, 5, and 6 of FIG. 7, respectively). The initial specific activity of the isolated PHA synthase was 20-fold higher than that obtained in previous attempts at expression and purification of this polypeptide. Approximately 1000 units of PHB synthase have been purified, based on calculations from the direct spectrophotometric assay detailed below, with an overall recovery of activity of 70%. The large proportion of synthase present in the membrane fraction, and the fact that over 90% of the initial activity was found in the soluble fraction, suggest either that the synthase in the membrane fraction is in an inactive form or that the direct assay is not applicable to the initial, 12 U/mg, crude extract.

TABLE 1
Purification of PHA Synthase

sample               total units  vol (mL)  protein (mg)  protein (mg/ml)  specific activity (U/mg)  recovery (%)
total protein        1430         11.5      113           9.8              12.7                      100
soluble protein      1340         10.5      47            4.5              28.6                      93
pooled HA fractions  1020         7.9       30            3.8              34.2                      71
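The derived columns of Table 1 (protein concentration, specific activity, and recovery) can be recomputed from the total units, volume, and protein mass of each sample. This is an illustrative sketch; the function and key names are not part of the original disclosure.

```python
# (sample, total units, volume in ml, protein in mg), taken from Table 1.
rows = [
    ("total protein", 1430, 11.5, 113),
    ("soluble protein", 1340, 10.5, 47),
    ("pooled HA fractions", 1020, 7.9, 30),
]

def summarize(purification_rows):
    """Derive concentration, specific activity and recovery for each step."""
    starting_units = purification_rows[0][1]
    summary = []
    for name, units, volume_ml, protein_mg in purification_rows:
        summary.append({
            "sample": name,
            "mg_per_ml": protein_mg / volume_ml,
            "specific_activity": units / protein_mg,          # U/mg
            "recovery_pct": 100.0 * units / starting_units,   # % of start
        })
    return summary

table = summarize(rows)
for r in table:
    print(f"{r['sample']:<20} {r['mg_per_ml']:5.1f} mg/ml "
          f"{r['specific_activity']:5.1f} U/mg {r['recovery_pct']:5.1f} %")
```

The recomputed specific activities for the soluble and pooled fractions differ from the tabulated values in the first decimal place, which may simply reflect rounding of the tabulated unit and protein totals.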

[0196] N-terminal sequencing of the 64 kDa protein confirmed its identity as PHA synthase (FIG. 8). Two prominent N-termini, at amino acid residue 7 (alanine) and residue 10 (serine), were obtained in a 3:2 ratio. This heterogeneous N-terminus presumably is the result of aminopeptidase activity. Western analysis using a rabbit-anti-PHA synthase antibody corroborated the results of the sequencing and indicated the presence of at least three bands that resulted from proteolysis of PHA synthase (FIG. 7B, lanes 4-6). The antibody was specific for PHA synthase since neither T. ni nor baculoviral proteins showed reactivity (FIG. 7B, lanes 2 and 3). N-terminal protein sequencing (FIG. 8) showed directly that the 44 kDa (band b) and 32 kDa (band d) proteins were derived from PHA synthase (fragments beginning at A181/N185 and at G387, respectively). The 35-40 kDa (band c) protein gave low sequencing yields and may contain a blocked N-terminus. Inspection of FIG. 7B suggests that most degradation occurs following cell disruption since the total protein sample of this gel (lane 4) was prepared by boiling intact cells directly in SDS sample buffer while the HA sample (lane 6) went through the purification procedure described above.

[0197] Assay of synthase activity. Due to the significant level of expression obtained using the baculovirus system, the synthase activity could be assayed spectrophotometrically by monitoring hydrolysis of the thioester bond at 232 nm, the wavelength at which there is a maximum decrease in absorbance upon hydrolysis. The difference between substrate (HBCoA) and product (CoA) at this wavelength is shown in FIG. 9. Absorbance of HBCoA and CoA at 232 nm occurs at a trough between two well-separated peaks. Assays were carried out at pH 7.2 for comparative analysis with previous studies (Gerngross et al., supra). Substrate (R-(−)-3-HBCoA) for these studies was prepared using the mixed anhydride method (Haywood et al., supra), and its concentration was determined by measuring A₂₆₀. The short pathlength cells (0.1 cm and 0.01 cm) allowed use of relatively high reaction concentrations while conserving substrate and enzyme. Assay results showed an initial lag period of 60 seconds prior to the linear decrease in A₂₃₂, and velocities were determined from the slope of these linear regions of the assay curves. The length of the lag period was variable and was inversely related to enzyme concentration. These data are consistent with those using PHA synthase purified from E. coli (Gerngross et al., supra).

[0198] FIGS. 10 and 11 show the V versus S and 1/V versus 1/S plots, respectively. The double reciprocal plot was concave upward, which is similar to results obtained from studies of the granular PHA synthase from Zooglea ramigera (Fukui et al., Arch. Microbiol., 110, 149 (1976)) and suggests a complex reaction mechanism. Examinations of velocity and specific activity as a function of enzyme concentration are shown in FIGS. 12 and 13. These results confirm that specific activity of the synthase depends upon enzyme concentration. The pH activity curve for A. eutrophus PHA synthase purified from T. ni cells is shown in FIG. 14. The curve shows a broad activity maximum centered around pH 8.5. This result agrees well with prior work on the A. eutrophus PHB synthase although it is significantly different than results obtained for the PHB synthase from Z. ramigera for which the optimum was determined to be pH 7.0.

[0199] The effect of varying enzyme concentration in the presence of a fixed amount of substrate revealed an intriguing trend (FIG. 15). From these data it appears that the extent of polymerization is dependent on the amount of enzyme included in the reaction mixture. This could be explained if there is a “terminal length” limitation of the polymer, which, once reached, cannot be extended any further. If this is the case, it would also suggest that termination of the polymerization reaction, the release of the synthase from the polymer, and/or reinitiation of polymerization by the newly released synthase are relatively slow events since no evidence of these reactions is seen within the time course of these studies. The phenomenon observed in FIG. 15 is not the result of decay of the enzyme over the course of the assay since virtually identical results are obtained following a 10 minute preincubation of the synthase at 25° C.

[0200] It must also be noted that comparisons of the direct spectrophotometric assays used here and the more common assay involving the use of Ellman's reagent, DTNB, (Ellman, supra) in the formation of thiolate of coenzyme-A showed that the values determined by the direct method were approximately 70% of the values determined using Ellman's reagent. This may be due to phase separation occurring in the cuvettes as the relatively insoluble polymer is formed. In support of this notion, a faint haze or opalescence in the cuvette developed during the course of the reaction, particularly at higher substrate concentrations.

[0201] PHA synthase purified from insect cells appears to be relatively stable. Examination of activity following storage in liquid N₂ and at −20° C. in the presence of 50% glycerol showed that approximately 50% of synthase activity remained after 7 weeks when stored in liquid N₂ and approximately 75% of synthase activity remained after 7 weeks when stored at −20° C. in the presence of 50% glycerol.

[0202] The expression of PHA synthase from A. eutrophus in a baculovirus expression system results in the synthase constituting approximately 50% of total protein 60 hours post-infection; however, approximately 50-75% of the synthase is observed in the membrane-associated fraction. This elevated level of expression allowed purification of the soluble PHA synthase using a single chromatographic step on HA. The purity of this preparation is estimated to be approximately 90% (intact PHA synthase and 3 proteolysis products).

[0203] The initial specific activity of 12 U/mg was approximately 20-fold higher than the most successful previous efforts at overexpression of A. eutrophus PHA synthase. The synthase reported here was isolated from a 250 ml culture with 70% recovery, which represents an improvement of 500-fold (1000 U/64 U × 8 L/0.25 L) when compared to an 8 L E. coli culture with 40% recovery. This high expression level should provide sufficient PHA synthase for extensive structural, functional, and mechanistic studies. Furthermore, it is clear that the baculovirus expression system is an attractive option for isolation of other PHA synthases from various sources.
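The 500-fold figure in the preceding paragraph is a ratio of volumetric yields (units of synthase recovered per liter of culture). A minimal sketch of that arithmetic, with illustrative names, is:

```python
def units_per_litre(total_units, culture_volume_l):
    """Volumetric yield of purified enzyme."""
    return total_units / culture_volume_l

# Values from the text: 1000 U from 0.25 L of insect cells,
# compared with 64 U from 8 L of E. coli.
insect_yield = units_per_litre(1000, 0.25)
e_coli_yield = units_per_litre(64, 8)
fold_improvement = insect_yield / e_coli_yield
print(f"{insect_yield:.0f} U/L vs {e_coli_yield:.0f} U/L: "
      f"{fold_improvement:.0f}-fold")
```

This comparison uses recovered units only and does not correct for the different recoveries (70% versus 40%) of the two purifications.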

[0204] PHA synthase produced in the baculovirus system was of sufficient potency to allow direct spectrophotometric analysis of the hydrolysis of the thioester bond of HBCoA at 232 nm. These assays revealed a lag period of approximately 60 seconds, the length of which was variable and inversely related to enzyme concentration. Such a lag period presumably reflects a slow step in the reaction, perhaps correlating to dimerization of the enzyme, the priming, and/or initiation steps in formation of PHB. Size exclusion chromatographic examination of the PHB synthase native MW indicated two forms of the synthase. One form showed a MW of approximately 100-160 kDa and the other showed a MW of approximately 50-80 kDa; these two forms likely represent the dimer and monomer of PHA synthase, respectively. Similar results have been reported previously in which two forms of approximately 60 and 130 kDa were observed. Comparisons of the direct assay reported here and the indirect assay using DTNB revealed that the former resulted in values that were 70% of the values determined by the DTNB indirect assay. Although the reason for this difference has not been examined in detail, it is probable that the apparent phase separation that occurred upon PHB formation in the short pathlength cuvettes used, particularly with high [HBCoA], results in this discrepancy.

[0205] Enzymatic analyses of the PHA synthase have found that the enzyme has a broad pH optimum centered at pH 8.5; however, the studies described herein have been performed at pH 7.2 to provide comparative values with the results of others. Moreover, the specific activity of this enzyme is dependent upon enzyme concentration, which confirms and extends earlier results (Gerngross et al., supra).

[0206] In studies intended to examine the dependence of activity upon enzyme concentration, it became apparent that the extent of the polymerization reaction is dependent on the amount of enzyme included in the reaction mixture. Specifically, decreasing the amount of enzyme leads not only to decreased velocity of reaction but also to a decreased extent of condensation (FIG. 15). One possible explanation is that the enzyme is thermally labile; however, identical assays in which the enzyme is preincubated at 25° C. for 10 minutes prior to initiation of the reaction had similar results. Another possibility is that a terminal-length of the polymer is reached, precluding further condensations until the particular synthase molecule is released from the terminal-length polymer.

[0207] This work clearly demonstrates the value of the baculovirus expression system for the production of A. eutrophus PHA synthase and for the potential application to studies of other PHA synthases. Furthermore, the high level of expression obtained using the baculoviral system should allow convenient analysis for substrate-specificity and structure-function studies of PHA synthases from relatively crude insect cell extracts.

Example 2 Co-expression of Rat FAS Dehydrase Mutant cDNA and PHB Synthase Gene in Insect Cells

[0208] Expression of a rat FAS DH- cDNA in Sf9 cells has been reported previously (Rangan et al., J. Biol. Chem., 266, 19180 (1991); Joshi et al., Biochem. J., 226, 143 (1993)). Once activity of the phbC gene product had been established in insect cells (see Example 1), baculovirus clones containing the rat FAS DH- cDNA and BacPAK6::phbC were employed in a double-infection strategy to determine if PHB would be produced in insect cells. It was not known if an intracellular pool of R-(−)-3-hydroxybutyrate would be stable or available as a substrate for the PHB synthase. In order for the R-(−)-3-hydroxybutyrylCoA to be available as a substrate, the R-(−)-3-hydroxybutyrylCoA released from rat FAS DH- protein must be trapped by the PHB synthase and incorporated into a polymer at a rate faster than β-oxidation, which would regenerate acetylCoA. It was also not known if the stereochemical configuration of the 3-hydroxyl group, which must be in the R form, would be recognized as a substrate by PHB synthase. Fortunately, previous biochemical studies on eukaryotic FASs indicated that the R form of 3-hydroxybutyrylCoA would be generated (Wakil et al., J. Biol. Chem., 237, 687 (1962)).

[0209] SDS-PAGE of protein samples from a time course of uninfected, single-infected, and dual-infected Sf21 cells was performed (FIG. 16). From these data, it is clear that the rat FAS DH mutant and PHB synthase polypeptides are efficiently co-expressed in Sf21 cells. However, co-expression results in approximately 50% reduced levels of both polypeptides compared to Sf21 cells that are producing the individual proteins. Western analysis using anti-rat FAS (Rangan et al., supra) and anti-PHA synthase antibodies confirmed simultaneous production of the corresponding proteins.

[0210] To provide further evidence that PHB was being synthesized in insect cells, T. ni cells which had been infected with a baculovirus vector encoding rat FAS DH⁰ and/or a baculovirus vector encoding PHA synthase were analyzed for the presence of granules. Infected cells were fixed in paraformaldehyde and incubated with anti-PHA synthase antibodies (Williams et al., Protein Exp. Purif., 7, 203 (1996)). Granules were observed only in doubly infected cells (Williams et al., App. Environ. Micro., 62, 2540 (1996)).

[0211] Characterization of PHB production in insect cells. In order to determine if de novo synthesis of PHB was occurring in Sf21 cells that co-express the rat FAS DH mutant and PHB synthase, fractions of these samples were extracted, the extract subjected to propanolysis, and analyzed for the presence of propylhydroxybutyrate by gas chromatography (FIG. 17). A unique peak with a retention time that coincided with a propylhydroxybutyrate standard was detected only in the double infection samples at 48 and 72 hours, in contrast to the individually expressed gene products and uninfected controls, which were negative. These samples were analyzed further by GC/MS to confirm the identity of the product. FIG. 18 shows mass spectroscopy data corresponding to the material obtained from peak 10.1 in the gas chromatograph compared to a propylhydroxybutyrate standard. The results show that PHB synthesis is occurring only in Sf21 cells co-expressing the rat FAS DH mutant cDNA and the phbC gene from A. eutrophus. Integration of the peak in the gas chromatograph corresponding to propylhydroxybutyrate revealed that approximately 1 mg of PHB was isolated from 1 liter culture of Sf21 cells (approximately 600 mg dry cell weight of Sf21 cells). Thus, the rat FAS206 protein effectively replaces the β-ketothiolase and acetoacetyl-CoA reductase functions, resulting in the production of PHB by a novel pathway.
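The yield reported above can be restated as a fraction of dry cell weight. This short sketch uses only the two approximate figures given in the text; the variable names are illustrative.

```python
phb_mg = 1.0                # approximate PHB recovered per liter of culture
dry_cell_weight_mg = 600.0  # approximate dry cell weight per liter

# PHB as a percentage of dry cell weight.
phb_percent = 100.0 * phb_mg / dry_cell_weight_mg
print(f"PHB is about {phb_percent:.2f}% of dry cell weight")
```

Because both inputs are approximate, the result (on the order of 0.2% of dry cell weight) should be read as an order-of-magnitude estimate.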

[0212] The approach described here provides a new strategy to combine metabolic pathways that are normally engaged in primary anabolic functions for production of polyesters. The premature termination of the normal fatty acid biosynthetic pathway to provide suitably modified acylCoA monomers for use in PHA synthesis can be applied to both prokaryotic and eukaryotic expression since the formation of polymer will not be dependent on specialized feedstocks. Thus, once a recombinant PHA monomer synthase is introduced into a prokaryotic or eukaryotic system, and co-expressed with the appropriate PHA synthase, novel biopolymer formation can occur.

Example 3 Cloning and Sequencing of the Vep ORFI PKS Gene Cluster

[0213] The entire PKS cluster from Streptomyces venezuelae was cloned using a heterologous hybridization strategy. A 1.2 kb DNA fragment that hybridized strongly to a DNA encoding an eryA PKS β-ketoacyl synthase domain was cloned and used to generate a plasmid for gene disruption. This method generated a mutant strain blocked in the synthesis of the antibiotic. A S. venezuelae genomic DNA library was generated and used to clone a cosmid containing the complete methymycin aglycone PKS DNA. Fine-mapping analysis was performed to identify the order and sequence of catalytic domains along the multifunctional PKS (FIG. 19). DNA sequence analysis of the vep ORFI showed that the order of catalytic domains is KS^(Q)/AT/ACP/KS/AT/KR/ACP/KS/AT/DH/KR/ACP. The complete DNA sequence, and corresponding amino acid sequence, of the vep ORFI is shown in FIG. 23 (SEQ ID NO:1 and SEQ ID NO:2, respectively).
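The domain order reported for the vep ORFI can be read as a loading module followed by two extension modules. The sketch below groups the domains on the assumption that each module ends at its ACP domain, which is an interpretive convention adopted here for illustration; the module labels and function name are likewise illustrative, and KSQ is written without the superscript.

```python
DOMAIN_ORDER = "KSQ/AT/ACP/KS/AT/KR/ACP/KS/AT/DH/KR/ACP"

def split_into_modules(domain_order):
    """Group a slash-separated domain list into modules ending at each ACP."""
    modules, current = [], []
    for domain in domain_order.split("/"):
        current.append(domain)
        if domain == "ACP":       # assumed module boundary
            modules.append(current)
            current = []
    if current:                    # keep any trailing domains without an ACP
        modules.append(current)
    return modules

modules = split_into_modules(DOMAIN_ORDER)
labels = ["loading"] + [f"module {i}" for i in range(1, len(modules))]
for label, domains in zip(labels, modules):
    print(f"{label}: {'/'.join(domains)}")
```

Read this way, the ORFI comprises a KSQ/AT/ACP loading module, a KS/AT/KR/ACP module, and a KS/AT/DH/KR/ACP module.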

[0214] The sequence data indicated that the PKS gene cluster encodes a polyene of twelve carbons. The vep gene cluster contains 5 polyketide synthase modules, with a loading module at its 5′ end and an ending domain at its 3′ end. Each of the sequenced modules includes a keto-ACP (KS), an acyltransferase (AT), a dehydratase (DH), a keto-reductase (KR), and an acyl carrier protein domain. The six acyltransferase domains in the cluster are responsible for the incorporation of six acetyl-CoA moieties into the product. The loading module contains a KSQ, an AT and an ACP domain. KSQ refers to a domain that is homologous to a KS domain except that the active site cysteine (C) is replaced by glutamine (Q). There is no counterpart to the KSQ domain in the PKS clusters which have been previously characterized.

[0215] The ending domain (ED) is an enzyme which is responsible for the attachment of the nascent polyketide chain onto another molecule. The amino acid sequence of ED resembles an enzyme, HetM, which is involved in Anabaena heterocyst formation. The homology between vep and HetM suggests that the polypeptide encoded by the vep gene cluster may synthesize a polyene-containing composition which is present in the spore coat or cell wall of its natural host, S. venezuelae.

Example 4 Preparation of a Vector Encoding a Saturated β-hydroxyhexanoylCoA Monomer or an Unsaturated β-hydroxyhexanoyl CoA Monomer

[0216] To provide a recombinant monomer synthase that generates a saturated β-hydroxyhexanoylCoA or unsaturated β-hydroxyhexanoylCoA monomer, the linear correspondence between the genetic organization of the Type I macrolide PKS and the catalytic domain organization in the multifunctional proteins is assessed (Donadio et al., supra, 1991; Katz et al., Ann. Rev. Microbiol., 47, 875 (1993)). First, a DNA encoding a TE is added to the 3′ end of an ORFI of a Type I PKS, preferably the met ORFI (FIG. 6), as recently described by Cortes et al. (Science, 268, 1487 (1995)) in the erythromycin system. To ensure that the DNA encoding the TE is completely active, DNA encoding a linker region separating a normal ACP-TE region in a PKS, for example, the one found in met PKS ORF5 (FIG. 5), will be incorporated into the DNA. The resulting vector can be introduced into a host cell and the TE activity, rate of release of the CoA product, and identity of the fatty acid chain determined.

[0217] The acyl chain that is most likely to be released is the CoA ester, specifically the 3-hydroxy-4-methyl heptenoylCoA ester, since the fully elongated chain is presumably released in this form prior to macrolide cyclization. If the CoA form of the acyl chain is not observed, then a gene encoding a CoA ligase will be cloned and co-expressed in the host cell to catalyze formation of the desired intermediate.

[0218] There is clear precedent for release of the predicted premature termination products from mutant strains of macrolide-producing Streptomyces that produce intermediates in macrolide synthesis (Huber et al., Antimicrob. Agents Chemother., 34, 1535 (1990); Kinoshita et al., J. Chem. Soc., Chem. Comm., 14, 943 (1988)). The structure of these intermediates is consistent with the linear organization of functional domains in macrolide PKSs, particularly those related to eryA, tyl, and met. Other known PKS gene clusters include, but are not limited to, the gene cluster encoding 6-methylsalicylic acid synthase (Beck et al., Eur. J. Biochem., 192, 487 (1990)), soraphen A (Schupp et al., J. Bacteriol., 177, 3673 (1995)), and sterigmatocystin (Yu et al., J. Bacteriol., 177, 4792 (1995)).

[0219] Once the release of the 3-hydroxy-4-methyl heptenoylCoA ester is established, DNA encoding the extender unit AT in met module 1 is replaced to change the specificity from methylmalonylCoA to malonylCoA (FIGS. 4-6). This change eliminates methyl group branching in the β-hydroxy acyl chain. While comparison of known AT amino acid sequences shows high overall amino acid sequence conservation, distinct regions are readily apparent where significant deletions or insertions have occurred. For example, comparison of malonyl and methylmalonyl amino acid sequences reveals a 37 amino acid deletion in the central region of the malonyltransferase. Thus, to change the specificity of the methylmalonyl transferase to malonyl transferase, the met ORFI DNA encoding the 37 amino acid sequence of MMT will be deleted, and the resulting gene will be tested in a host cell for production of the desmethyl species, 3-hydroxyheptenoylCoA. Alternatively, the DNA encoding the entire MMT can be replaced with a DNA encoding an intact MT to effect the desired chain construction.

[0220] After replacing MMT with MT, DNA encoding DH/ER will be introduced into DNA encoding met ORFI module 1. This modification results in a multifunctional protein that generates a methylene group at C-3 of the acyl chain (FIG. 6). The DNA encoding DH/ER will be PCR amplified from the available eryA or tyl PKS sequences, including the DNA encoding the required linker regions, employing a primer pair to conserved sequences 5′ and 3′ of the DNA encoding DH/ER. The PCR fragment will then be cloned into the met ORFI. The result is a DNA encoding a multifunctional protein (MT* DH/ER*TE*). This protein possesses the full complement of keto group processing steps and results in the production of heptenoylCoA.

[0221] The DNA encoding dehydrase in met module 2 is then inactivated, using site-directed mutagenesis in a scheme similar to that used to generate the rat FAS DH- described above (Joshi et al., J. Biol. Chem., 268, 22508 (1993)). This preserves the required (R)-3-hydroxy group which serves as the substrate for PHA synthases and results in (R)-3-hydroxyheptanoylCoA species.

[0222] The final domain replacement will involve the DNA encoding the starter unit acyltransferase in met module 1 (FIG. 5), to change the specificity from propionyl CoA to acetyl CoA. This shortens the (R)-3-hydroxy acyl chain from heptanoyl to hexanoyl. The DNA encoding the catalytic domain will need to be generated based on a FAS or 6-methylsalicylic acid synthase model (Beck et al., Eur. J. Biochem., 192, 487 (1990)) or by using site-directed mutagenesis to alter the specificity of the resident met PKS propionyltransferase sequence. Limiting the initiator species to acetylCoA can result in the use of this starter unit by the monomer synthase. Previous work with macrolide synthases has shown that some are able to accept a wide range of starter unit carboxylic acids. This is particularly well documented for avermectin synthase, where over 60 new compounds have been produced by altering the starter unit substrate in precursor feeding studies (Dutton et al., J. Antibiotics, 44, 357 (1991)).

Example 5 Preparation of a Vector Encoding a Recombinant Monomer Synthase that Synthesizes 3-hydroxy-4-hexenoic Acid

[0223] To provide a recombinant monomer synthase that synthesizes 3-hydroxy-4-hexenoic acid, a precursor for polyhydroxyhexenoate, the DNA segment encoding the loading and the first module of the vep gene cluster was linked to the DNA segment encoding module 7 of the tyl gene cluster so as to yield a recombinant DNA molecule encoding a fusion polypeptide which has no amino acid differences relative to the corresponding amino acid sequence of the parent modules. The fusion polypeptide catalyzes the synthesis of 3-hydroxy-4-hexenoic acid. The recombinant DNA molecule was introduced into SCP2, a Streptomyces vector, under the control of the act promoter (pDHS502, FIG. 20). A polyhydroxyalkanoate polymerase gene, phacI from Pseudomonas oleovorans, was then introduced downstream of the recombinant PKS cluster (pDHS505; FIGS. 22 and 23). The DNA segment encoding the polyhydroxyalkanoate polymerase is linked to the DNA segment encoding the recombinant PKS synthase so as to yield a fusion polypeptide which synthesizes polyhydroxyhexenoate in Streptomyces. Polyhydroxyhexenoate, a biodegradable thermoplastic, is not naturally synthesized in Streptomyces, or as a major product in any other organism. Moreover, the unsaturated double bond in the side chain of polyhydroxyhexenoate may result in a polymer which has superior physical properties as a biodegradable thermoplastic over the known polyhydroxyalkanoates.

Example 6 Deletion of the desR Gene of the Desosamine Biosynthetic Gene Cluster

[0224] As some macrolides have more than one attached sugar moiety, the assignment of sugar biosynthetic genes to the appropriate sugar biosynthetic pathway can be quite difficult. Since methymycin (a compound of formula (1)) and neomethymycin (a compound of formula (2)) (FIG. 24) (Donin et al., 1953; Djerassi et al., 1956), two closely related macrolide antibiotics produced by Streptomyces venezuelae, contain desosamine as their sole sugar component, the organization of the sugar biosynthetic genes in the methymycin/neomethymycin gene cluster may be less complicated. Thus, this system was chosen for the study of the biosynthesis of desosamine, an N,N-dimethylamino-3,4,6-trideoxyhexose, which also exists in the erythromycin structure (Flinn et al., 1954).

[0225] To study the formation of this unusual sugar, a DNA library was constructed by partially digesting the genomic DNA of S. venezuelae (ATCC 15439) with Sau3A I into 35-40 kb fragments which were ligated into the cosmid vector pNJ1 (Tuan et al., 1990). The recombinant DNA was packaged into bacteriophage λ which was used to transfect E. coli DH5α. The resulting cosmid library was screened for desired clones using the tylA1 and tylA2 genes from the tylosin biosynthetic cluster as probes (Baltz et al., 1988; Merson-Davies et al., 1994). These two probes are specific for sugar biosynthetic genes whose products catalyze the first two steps universally followed by all unusual 6-deoxyhexoses studied thus far. The initial reaction involves conversion of glucose-1-phosphate to TDP-D-glucose by α-D-glucose-1-phosphate thymidylyltransferase (TylA1) and subsequently, TDP-D-glucose is transformed to TDP-4-keto-6-deoxy-D-glucose by TDP-D-glucose 4,6-dehydratase (TylA2). Three cosmids were found to contain genes homologous to tylA1 and tylA2. Further analysis of these cosmids led to the identification of nine open reading frames (ORFs) downstream of the PKS genes (FIG. 24). Based on sequence similarities to other sugar biosynthetic genes, especially those derived from the erythromycin cluster (Gaisser et al., 1997; Summers et al., 1997), eight of these nine ORFs are believed to be involved in the biosynthesis of TDP-D-desosamine. Interestingly, the ery cluster lacks homologs of the tylA1 and tylA2 genes that are responsible for the first two steps in the desosamine pathway. It is possible that the erythromycin biosynthetic machinery may rely on a general cellular pool of TDP-4-keto-6-deoxy-D-glucose for mycarose and desosamine formation. Depicted in FIG. 24 is a biosynthetic pathway for TDP-D-desosamine.
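Sau3A I recognizes the tetranucleotide GATC and cuts immediately 5′ of the G, so a partial digest yields fragments that each span several adjacent complete-digest fragments. The following Python sketch locates GATC sites in a sequence and lists the fragment lengths of a complete digest of a linear molecule. The function names and the toy sequence are hypothetical and are used only for illustration.

```python
import re


def sau3ai_cut_positions(seq: str) -> list:
    """Return 0-based top-strand cut positions (the index of each G in GATC)."""
    # A lookahead is used so that every occurrence is reported.
    return [m.start() for m in re.finditer(r"(?=GATC)", seq.upper())]


def complete_digest_lengths(seq: str) -> list:
    """Fragment lengths after cutting a linear molecule at every GATC site."""
    bounds = [0] + sau3ai_cut_positions(seq) + [len(seq)]
    lengths = [b - a for a, b in zip(bounds, bounds[1:])]
    return [n for n in lengths if n > 0]  # drop an empty fragment if a site is at index 0


toy = "AAGATCTTTTGATCCC"  # hypothetical 16-nt sequence with two GATC sites
print(sau3ai_cut_positions(toy))
print(complete_digest_lengths(toy))
```

In a partial digest only a fraction of these sites is cut in any one molecule, which is how fragments in the 35-40 kb range suitable for cosmid cloning are obtained from a frequently cutting enzyme.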

[0226] Although eight of the nine ORFs have been assigned to desosamine formation, the presence of desR, which shows strong sequence homology to β-glucosidases (as high as 39% identity and 46% similarity) (Castle et al., 1998), within the desosamine gene cluster is puzzling. To investigate the function of DesR relative to the biosynthesis of methymycin/neomethymycin, a disruption plasmid (pBL1005) derived from pKC1139 (containing an apramycin resistance marker) (Bierman et al., 1992) was constructed in which a 1.0 kb NcoI/XhoI fragment of the desR gene was deleted and replaced by the thiostrepton resistance (tsr) gene (1.1 kb) (Bibb et al., 1985) via blunt-end ligation. This plasmid was used to transform E. coli S17-1, which serves as the donor strain to introduce the pBL1005 construct through conjugal transfer into the wild-type S. venezuelae (Bierman et al., 1992). The double crossover mutants in which chromosomal desR had been replaced with the disrupted gene were selected according to their thiostrepton-resistant and apramycin-sensitive characteristics. Southern blot hybridization analysis was used to confirm the gene replacement.
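The selection logic described above can be summarized as a small decision table. In the Python sketch below, only the thiostrepton-resistant, apramycin-sensitive case is taken from the text; the labels for the other three phenotypes are interpretive assumptions based on the plasmid carrying both markers, and the function name is hypothetical.

```python
def classify_exconjugant(thio_resistant: bool, apra_resistant: bool) -> str:
    """Interpret a resistance phenotype after conjugation with the disruption plasmid."""
    if thio_resistant and not apra_resistant:
        # tsr is retained in the chromosome, the vector backbone is lost (from the text).
        return "candidate double crossover"
    if thio_resistant and apra_resistant:
        # Assumption: the whole plasmid is still present, free or integrated.
        return "plasmid retained or single crossover"
    if not thio_resistant and not apra_resistant:
        # Assumption: neither marker is present.
        return "no marker (wild type or plasmid lost)"
    # Assumption: vector marker without tsr is not expected for this construct.
    return "unexpected phenotype"


for thio in (True, False):
    for apra in (True, False):
        print(thio, apra, classify_exconjugant(thio, apra))
```

A candidate identified this way is only presumptive; as stated in the text, the gene replacement is confirmed by Southern blot hybridization.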

[0227] The desired mutant was first grown at 29° C. in seed medium for 48 hours, and then inoculated and grown in vegetative medium for another 48 hours (Cane et al., 1993). After the fermentation broth was centrifuged at 10,000 g to remove cellular debris and mycelia, the supernatant was adjusted to pH 9.5 with concentrated KOH, and extracted four times with an equal volume of chloroform. The organic layer was dried over sodium sulfate and evaporated to dryness. The amber oil-like crude products were first subjected to flash chromatography on silica gel using a gradient of 0-40% methanol in chloroform, followed by HPLC purification on a C₁₈ column eluted isocratically with 45% acetonitrile in 57 mM ammonium acetate (pH 6.7). In addition to methymycin (a compound of formula (1)) and neomethymycin (a compound of formula (2)), two new products were isolated. The yields of a compound of formula (13) and a compound of formula (14) were each in the range of 5-10 mg/L of fermentation broth. However, a compound of formula (1) and a compound of formula (2) remained the major products. High-resolution FAB-MS revealed that both new compounds have identical molecular compositions that differ from methymycin/neomethymycin by an extra hexose. The chemical nature of these two new compounds was elucidated by extensive spectral analysis to be C-2′ β-glucosylated methymycin and neomethymycin (a compound of formula (13) and formula (14), respectively).

[0228] The spectral data of (13): ¹H NMR (acetone-d₆) δ 6.56 (1H, d, J=16.0, 9-H), 6.46 (1H, d, J=16.0, 8-H), 4.67 (1H, dd, J=10.8, 2.0, 11-H), 4.39 (1H, d, J=7.5, 1′-H), 4.32 (1H, d, J=8.0, 1″-H), 3.99 (1H, dd, J=11.5, 2.5, 6″-H), 3.72 (1H, dd, J=11.5, 5.5, 6″-H), 3.56 (1H, m, 5′-H), 3.52 (1H, d, J=10.0, 3-H), 3.37 (1H, t, J=8.5, 3″-H), 3.33 (1H, m, 5″-H), 3.28 (1H, t, J=8.5, 4″-H), 3.23 (1H, dd, J=10.5, 7.5, 2′-H), 3.15 (1H, dd, J=8.5, 8.0, 2″-H), 3.10 (1H, m, 2-H), 2.75 (1H, 3′-H, buried under H₂O peak), 2.42 (1H, m, 6-H), 2.28 (6H, s, NMe₂), 1.95 (1H, m, 12-H), 1.9 (1H, m, 5-H), 1.82 (1H, m, 4′-H), 1.50 (1H, m, 12-H), 1.44 (3H, d, J=7.0, 2-Me), 1.4 (1H, m, 5-H), 1.34 (3H, s, 10-Me), 1.3 (1H, m, 4-H), 1.25 (1H, m, 4′-H), 1.20 (3H, d, J=6.0, 5′-Me), 1.15 (3H, d, J=7.0, 6-Me), 0.95 (3H, d, J=6.0, 4-Me), 0.86 (3H, t, J=7.5, 12-Me). High-resolution FAB-MS: calc for C₃₁H₅₄NO₁₂ (M+H)⁺ 632.3646, found 632.3686.

[0229] Spectral data of (14): ¹H NMR (acetone-d₆) δ 6.69 (1H, dd, J=16.0, 5.5 Hz, 9-H), 6.55 (1H, dd, J=16.0, 1.3, 8-H), 4.71 (1H, dd, J=9.0, 2.0, 11-H), 4.37 (1H, d, J=7.0, 1′-H), 4.31 (1H, d, J=8.0, 1″-H), 3.97 (1H, dd, J=11.5, 2.5, 6″-H), 3.81 (1H, dq, J=9.0, 6.0, 12-H), 3.72 (1H, dd, J=11.5, 5.0, 6″-H), 3.56 (1H, m, 5′-H), 3.50 (1H, bd, J=10.0, 3-H), 3.36 (1H, t, J=8.5, 3″-H), 3.32 (1H, m, 5″-H), 3.30 (1H, t, J=8.5, 4″-H), 3.23 (1H, dd, J=10.2, 7.0, 2′-H), 3.13 (1H, dd, J=8.5, 8.0, 2″-H), 3.09 (1H, m, 2-H), 3.08 (1H, m, 10-H), 2.77 (1H, ddd, J=12.5, 10.2, 4.5, 3′-H), 2.41 (1H, m, 6-H), 2.28 (6H, s, NMe₂), 1.89 (1H, t, J=13.0, 5-H), 1.83 (1H, ddd, J=12.5, 4.5, 1.5, 4′-H), 1.41 (3H, d, J=7.0, 2-Me), 1.3 (1H, m, 4-H), 1.25 (1H, m, 5-H), 1.2 (1H, m, 4′-H), 1.20 (3H, d, J=6.0, 5′-Me), 1.17 (6H, d, J=7.0, 6-Me, 10-Me), 1.12 (3H, d, J=6.0, 12-Me), 0.96 (3H, d, J=6.0, 4-Me). ¹³C NMR (acetone-d₆) δ 204.1 (C-7), 175.8 (C-1), 148.2 (C-9), 126.7 (C-8), 108.3 (C-1″), 104.2 (C-1′), 85.1 (C-3), 83.0 (C-2′), 78.2 (C-3″), 78.1 (C-5″), 76.6 (C-2″), 76.4 (C-11), 71.8 (C-4″), 69.3 (C-5′), 66.1 (C-12), 66.0 (C-3′), 63.7 (C-6″), 46.2 (C-6), 44.4 (C-2), 40.8 (NMe₂), 36.4 (C-10), 34.7 (C-5), 34.0 (C-4), 29.5 (C-4′), 21.5 (5′-Me), 21.5 (12-Me), 17.9 (6-Me), 17.7 (4-Me), 17.2 (2-Me), 9.9 (10-Me). High-resolution FAB-MS: calc for C₃₁H₅₄NO₁₂ (M+H)⁺ 632.3646, found 632.3648.
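The calculated value quoted for the (M+H)⁺ ions of (13) and (14) can be checked by summing monoisotopic atomic masses over the formula C₃₁H₅₄NO₁₂. The Python sketch below does this and expresses each reported "found" value as a deviation in parts per million. The isotope masses are standard tabulated values rounded to eight decimals, the function names are hypothetical, and the sum is taken over neutral atoms without an electron-mass correction, which is an assumption about how the quoted value was obtained.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def monoisotopic_mass(formula: str) -> float:
    """Sum monoisotopic masses for a simple formula such as 'C31H54NO12'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def ppm_error(calculated: float, found: float) -> float:
    """Deviation of a measured m/z from the calculated value, in ppm."""
    return (found - calculated) / calculated * 1e6


calc = monoisotopic_mass("C31H54NO12")
print(round(calc, 4))
for label, found in (("13", 632.3686), ("14", 632.3648)):
    print(label, round(ppm_error(calc, found), 1))
```

Both measurements lie within a few ppm of the calculated value, which is consistent with the two compounds sharing the same elemental composition.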

[0230] The coupling constant (d, J=8.0 Hz) of the anomeric hydrogen (1″-H) of the added glucose and the magnitude of the downfield shift (11.8 ppm) of C-2′ of desosamine are both consistent with the assigned C-2′ β-configuration (Seo et al., 1978).

[0231] The antibiotic activity of a compound of formula (13) and (14) against Streptococcus pyogenes was examined by separately applying 20 μL of each sample (1.6 mM in MeOH) to sterilized filter paper discs which were placed onto the surface of S. pyogenes grown on Mueller-Hinton agar plates (Mangahas, 1996). After being grown overnight at 37° C., the plates of the controls (a compound of formula (1) and (2)) showed clearly visible inhibition zones. In contrast, no such clearings were discernible around the discs of a compound of formula (13) and (14). Evidently, β-glucosylation at C-2′ of desosamine in methymycin/neomethymycin renders these antibiotics inactive.

[0232] It should be noted that similar phenomena involving inactivation of macrolide antibiotics by glycosylation are known (Celmer et al., 1985; Kuo et al., 1989; Sasaki et al., 1996). For example, it was found that when erythromycin was given to Streptomyces lividans, which contains a macrolide glycosyltransferase (MgtA), the bacterium was able to defend itself by glycosylating the drug (Cundliffe, 1992; Jenkins et al., 1991). Such a macrolide glycosyltransferase activity has been detected in 15 out of a total of 32 actinomycete strains producing various polyketide antibiotics (Sasaki et al., 1996). Interestingly, the co-existence of a macrolide glycosyltransferase (OleD) capable of deactivating oleandomycin by glucosylation (Hernandez et al., 1993), and an extracellular β-glucosidase capable of removing the added glucose from the deactivated oleandomycin in Streptomyces antibioticus (Vilches et al., 1992) has led to the speculation of glycosylation as a possible self-resistance mechanism in S. antibioticus. Although the genes of the aforementioned glycosyltransferases have been cloned in a few cases, such as mgtA of S. lividans and oleD of S. antibioticus, the whereabouts of macrolide β-glycosidase genes remain obscure. Interestingly, the recently released eryBI sequence, which is part of the erythromycin biosynthetic cluster, is highly homologous to desR (55% identity) (Gaisser et al., 1997).
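Percent identity figures such as the one quoted above are derived from a pairwise alignment of the two protein sequences. The Python sketch below shows one common way to compute the figure from two pre-aligned sequences, counting identical residues over the columns in which neither sequence has a gap. The choice of denominator varies between programs, so this convention is an assumption, and the function name and toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy alignment (hypothetical): five ungapped columns, four of them identical.
print(percent_identity("ACDE-F", "ACDQGF"))
```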

[0233] The discovery of desR, a macrolide β-glucosidase gene, within the desosamine gene cluster is thus significant, and the accumulation of deactivated compounds of formula (13) and (14) after desR disruption provides direct molecular evidence indicating that a similar self-defense mechanism via glycosylation/deglycosylation may also be operative in S. venezuelae. However, because significant amounts of methymycin and neomethymycin also exist in the fermentation broth of the mutant strain, glucosylation of desosamine may not be the primary self-resistance mechanism in S. venezuelae. Indeed, an rRNA methyltransferase gene found upstream from the PKS genes in this cluster may confer the primary self-resistance protection. Thus, these results are consistent with the fact that antibiotic producing organisms generally have more than one defensive option (Cundliffe, 1989). In light of this observation, it is conceivable that methymycin/neomethymycin may be produced in part as the inert diglycosides (a compound of formula (13) or (14)), and the macrolide β-glucosidase encoded by desR is responsible for transforming methymycin/neomethymycin from their dormant state to their active form. Supporting this idea, the translated desR gene has a leader sequence characteristic of secretory proteins (von Heijne, 1986; von Heijne, 1989). Thus, DesR may be transported through the cell membrane and hydrolyze the modified antibiotics extracellularly to activate them (FIG. 25).

[0234] Summary

[0235] Inspired by the complex assembly and the enzymology of aminodeoxy sugars that are frequently found as essential components of macrolide antibiotics, the entire desosamine biosynthetic gene cluster from the methymycin and neomethymycin producing strain Streptomyces venezuelae was cloned, sequenced, and mapped. Eight of the nine mapped genes were assigned to the biosynthesis of TDP-D-desosamine based on sequence similarities to those derived from the erythromycin cluster. The remaining gene, designated desR, showed strong sequence homology to β-glucosidases.

[0236] To investigate the function of the encoded protein (DesR), a disruption mutant was constructed in which a NcoI/XhoI fragment of the desR gene was deleted and replaced by the thiostrepton resistance (tsr) gene. In addition to methymycin and neomethymycin, two new products were isolated from the fermentation of the mutant strain. These two new compounds, which are biologically inactive, were found to be C-2′ β-glucosylated methymycin and neomethymycin. Since the translated desR gene has a leader sequence characteristic of secretory proteins, the DesR protein may be an extracellular β-glucosidase capable of removing the added glucose from the modified antibiotics to activate them. Thus, the occurrence of desR within the desosamine gene cluster and the accumulation of deactivated glucosylated methymycin/neomethymycin upon disruption of desR provide strong molecular evidence suggesting that a self-resistance mechanism via glucosylation may be operative in S. venezuelae.

[0237] Thus, the desR gene can be used as a probe to identify homologs in other antibiotic biosynthetic pathways. Deletion of the corresponding macrolide glycosidase gene in other antibiotic biosynthetic pathways may lead to the accumulation of the glycosylated products which may be used as prodrugs with reduced cytotoxicity. Glycosylation also holds promise as a tool to regulate and/or minimize the potential toxicity associated with new macrolide antibiotics produced by genetically engineered microorganisms. Moreover, the availability of macrolide glycosidases, which can be used for the activation of newly formed antibiotics that have been deliberately deactivated by engineered glycosyltransferases, may be useful in the development of novel antibiotics using the combinatorial biosynthetic approach (Hopwood et al., 1990; Katz et al., 1993; Hutchinson et al., 1995; Carreras et al., 1997; Kramer et al., 1996; Khosla et al., 1996; Jacobsen et al., 1997; Marsden et al., 1998).

Example 7 Deletion of the desVI Gene of the Desosamine Biosynthetic Gene Cluster

[0238] The emergence of pathogenic bacteria resistant to many commonly used antibiotics poses a serious threat to human health and has been the impetus of the present resurgent search for new antimicrobial agents (Box et al., 1997; Davies, 1996; Service, 1995). Since the first report on using genetic engineering techniques to create “hybrid” polyketides (Hopwood et al., 1995), the potential of manipulating the genes governing the biosynthesis of secondary metabolites to create new bioactive compounds, especially macrolide antibiotics, has received much attention (Kramer et al., 1996; Khosla et al., 1996). This class of clinically important drugs consists of two essential structural components: a polyketide aglycone and the appended deoxy sugars (Omura, 1984). The aglycone is synthesized via sequential condensations of acyl thioesters catalyzed by a highly organized multi-enzyme complex, polyketide synthase (PKS) (Hopwood et al., 1990; Katz, 1993; Hutchinson et al., 1995; Carreras et al., 1997). Recent advances in the understanding of polyketide biosynthesis have allowed recombination of the PKS genes to construct an impressive array of novel skeletons (Kramer et al., 1996; Khosla et al., 1996; Hopwood et al., 1990; Katz, 1993; Hutchinson et al., 1995; Carreras et al., 1997; Epp et al., 1989; Donadio et al., 1993; Arisawa et al., 1994; Jacobsen et al., 1997; Marsden et al., 1998). Without the sugar components, however, these new compounds are usually biologically impotent. Hence, if one plans to make new macrolide antibiotics by a combinatorial biosynthetic approach, two immediate challenges must be overcome: assembling a repertoire of novel sugar structures and then having the capacity to couple these sugars to the structurally diverse macrolide aglycones.

[0239] Unfortunately, knowledge of the formation of the unusual sugars in these antibiotics remains limited (Liu et al., 1994; Kirschning et al., 1997; Johnson et al., 1998). Part of the reason for this comes from the fact that the sugar genes are generally scattered at both ends of the PKS genes. Such an organization within the macrolide biosynthetic gene cluster makes it difficult to distinguish the sugar genes from those encoding regulatory proteins or aglycone modification enzymes that are also interspersed in the same regions. The task can be made even more formidable if the macrolides contain multiple sugar components. In view of the “scattered” nature of the sugar biosynthetic genes, the antibiotic methymycin (a compound of formula (1) in FIG. 24) and its co-metabolite, neomethymycin (a compound of formula (2) in FIG. 24), of Streptomyces venezuelae present themselves as an attractive system to study the formation of deoxy sugars (Donin et al., 1953; Djerassi et al., 1956). First, they carry D-desosamine (a compound of formula (3)), a prototypical aminodeoxy sugar that also exists in erythromycin. Second, since desosamine is the only sugar attached to the macrolactone of formula (1) and (2), identification of the sugar biosynthetic genes within the methymycin/neomethymycin gene cluster should be possible with much more certainty.

[0240] A 10 kb stretch of DNA downstream from the methymycin/neomethymycin gene cluster, which is about 60 kb in length, was found to harbor the entire desosamine biosynthetic gene cluster (FIG. 26). Among the nine open reading frames (ORFs) mapped in this segment, eight are likely to be involved in desosamine formation, while the remaining one, desR, encodes a macrolide β-glycosidase that may be involved in a self-resistance mechanism. Their identities, shown in FIG. 26, are assigned based on sequence similarities to other sugar biosynthetic genes (Gaisser et al., 1997; Summers et al., 1997). The proposed pathway is well founded on literature precedent and mechanistic intuition for the construction of aminodeoxy sugars (Liu et al., 1994; Kirschning et al., 1997; Johnson et al., 1998).

[0241] To determine whether new methymycin/neomethymycin analogues carrying modified sugars could be generated by altering the desosamine biosynthetic genes, the desVI gene, which has been predicted to encode the N-methyltransferase, was chosen as a target (Gaisser et al., 1997; Summers et al., 1997). The deduced desVI product is most closely related to that of eryCVI from the erythromycin producing strain Saccharopolyspora erythraea (70% identity), and also strongly resembles the predicted products of rdmD from the rhodomycin cluster of Streptomyces purpurascens (Niemi et al., 1995), srmX from the spiramycin cluster of Streptomyces ambofaciens (Geistlich et al., 1992), and tylM1 from the tylosin cluster of Streptomyces fradiae (Gandecha et al., 1997). All of these enzymes contain the consensus sequence LLDV(I)ACGTG (SEQ ID NO:25) (Gaisser et al., 1997; Summers et al., 1997), near their N-terminus, which is part of the S-adenosylmethionine binding site (Ingrosso et al., 1989; Haydock et al., 1991).
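A consensus sequence of this kind can be located in a deduced protein sequence with a simple pattern search. The Python sketch below reads the notation LLDV(I)ACGTG as "valine or isoleucine at the fourth position", which is an assumption about the intended meaning. The function name and the three toy sequences are hypothetical and do not correspond to any of the proteins named above.

```python
import re

# LLDV(I)ACGTG interpreted as L-L-D-[V or I]-A-C-G-T-G (assumed reading).
SAM_MOTIF = re.compile(r"LLD[VI]ACGTG")


def find_sam_motif(protein: str):
    """Return the 0-based start of the first motif match, or None if absent."""
    match = SAM_MOTIF.search(protein.upper())
    return match.start() if match else None


toy_sequences = {
    "toy_val": "MTRLLDVACGTGRHA",  # valine variant of the motif
    "toy_ile": "MSKLLDIACGTGAL",   # isoleucine variant of the motif
    "toy_none": "MSKLLDAACGTG",    # alanine at the variable position, no match
}
for name, seq in toy_sequences.items():
    print(name, find_sam_motif(seq))
```

A match close to the start of the sequence would be in keeping with the statement that the motif lies near the N-terminus of these enzymes.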

[0242] First, the deletion of desVI should have little polar effect (Lin et al., 1984) on the expression of other desosamine biosynthetic genes because the ORF (desR) lying immediately downstream from desVI is not directly involved in desosamine formation, and those lying further downstream are transcribed in the opposite direction. Second, since N,N-dimethylation is almost certainly the last step in the desosamine biosynthetic pathway (Liu et al., 1994; Kirschning et al., 1997; Johnson et al., 1998; Gaisser et al., 1997; Summers et al., 1997), perturbing this step may lead to the accumulation of a compound of formula (4), which stands the best chance among all other intermediates of being recognized by the glycosyltransferase (DesVII) for successful linkage to the macrolactone of formula (6) (FIG. 25). Deletion and/or disruption of a single biosynthetic gene often affects the pathway at more than one specific step. In fact, disruption of eryCVI, the desVI equivalent in the erythromycin cluster, which has been predicted to encode a similar N-methylase to make desosamine in erythromycin (Gaisser et al., 1997; Summers et al., 1997), led to the accumulation of an intermediate devoid of the entire desosamine moiety (Summers et al., 1997).

[0243] A plasmid pBL3001, in which desVI was replaced by the thiostrepton gene (tsr) (Bibb et al., 1985), was constructed and introduced into wild type S. venezuelae by conjugal transfer using E. coli S17-1 (Bierman et al., 1992). Two identical double crossover mutants, KdesVI-21 and KdesVI-22, with phenotypes of thiostrepton resistance (Thio^(R)) and apramycin sensitivity (Apm^(S)) were obtained. Southern blot hybridization using tsr or a 1.1 kb HincII fragment from the desVII region further confirmed that the desVI gene was indeed replaced by tsr on the chromosome of these mutants. The KdesVI-21 mutant was first grown at 29° C. in seed medium (100 mL) for 48 hours, and then inoculated and grown in vegetative medium (3 L) for another 48 hours (Cane et al., 1993). The fermentation broth was centrifuged to remove the cellular debris and mycelia, and the supernatant was adjusted to pH 9.5 with concentrated KOH, followed by extraction with chloroform. No methymycin or neomethymycin was found; instead, the 10-deoxy-methynolide (6) (350 mg) (Lambalot et al., 1992) and two new macrolides containing an N-acetylated amino sugar, a compound of formula (7) (20 mg) and a compound of formula (8) (15 mg), were isolated. Their structures were determined by spectral analyses and high-resolution MS.

[0244] Spectral data of formula 7 are: ¹H NMR (CDCl₃) δ 6.62 (1H, d, J=16.0, H-9), 6.22 (1H, d, J=16.0, H-8), 5.75 (1H, d, J=7.5, N-H), 4.75 (1H, dd, J=10.8, 2.2, H-11), 4.28 (1H, d, J=7.5, H-1′), 3.95 (1H, m, H-3′), 3.64 (1H, d, J=10.5, H-3), 3.56 (1H, m, H-5′), 3.16 (1H, dd, J=10.0, 7.5, H-2′), 2.84 (1H, dq, J=10.5, 7.0, H-2), 2.55 (1H, m, H-6), 2.02 (3H, s, NAc), 1.95 (1H, m, H-12), 1.90 (1H, m, H-4′), 1.66 (1H, m, H-5), 1.50 (1H, m, H-12), 1.41 (3H, d, J=7.0, 2-Me), 1.40 (1H, m, H-5), 1.34 (3H, s, 10-Me), 1.25 (1H, m, H-4), 1.22 (1H, m, H-4′), 1.21 (3H, d, J=6.0, H-6′), 1.17 (3H, d, J=7.0, 6-Me), 1.01 (3H, d, J=6.5, 4-Me), 0.89 (3H, t, J=7.2, 12-Me); ¹³C NMR (CDCl₃) δ 204.3 (C-7), 175.1 (C-1), 171.8 (Me—C═O), 149.1 (C-9), 125.3 (C-8), 104.4 (C-1′), 85.4 (C-3), 76.3 (C-11), 75.4 (C-2′), 74.1 (C-10), 68.6 (C-5′), 51.9 (C-3′), 45.0 (C-6), 44.0 (C-2), 38.5 (C-4′), 33.8 (C-5), 33.3 (C-4), 23.1 (Me—C═O), 21.1 (C-12), 20.6 (C-6′), 19.2 (10-Me), 17.5 (6-Me), 17.2 (4-Me), 16.2 (2-Me), 10.6 (12-Me). High-resolution FABMS: calc for C₂₅H₄₃O₈N (M+H)⁺ 484.2910, found 484.2903.

[0245] Spectral data of formula 8 are: ¹H NMR (CDCl₃) δ 6.76 (1H, dd, J=16.0, 5.5, H-9), 6.44 (1H, dd, J=16.0, 1.5, H-8), 5.50 (1H, d, J=6.5, N-H), 4.80 (1H, dd, J=9.0, 2.0, H-11), 4.28 (1H, d, J=7.5, H-1′), 3.95 (1H, m, H-3′), 3.88 (1H, m, H-12), 3.62 (1H, d, J=11.0, H-3), 3.57 (1H, m, H-5′), 3.18 (1H, dd, J=10.0, 7.5, H-2′), 3.06 (1H, m, H-10), 2.86 (1H, dq, J=11.0, 7.0, H-2), 2.54 (1H, m, H-6), 2.04 (3H, s, NAc), 1.98 (1H, m, H-4′), 1.67 (1H, m, H-5), 1.40 (1H, m, H-5), 1.39 (3H, d, J=7.0, 2-Me), 1.25 (1H, m, H-4), 1.22 (1H, m, H-4′), 1.22 (3H, d, J=6.0, H-6′), 1.21 (3H, d, J=6.0, 6-Me), 1.19 (3H, d, J=7.0, 12-Me), 1.16 (3H, d, J=6.5, 10-Me), 1.01 (3H, d, J=6.5, 4-Me); ¹³C NMR (CDCl₃) δ 205.1 (C-7), 174.6 (C-1), 171.9 (Me—C═O), 147.2 (C-9), 126.2 (C-8), 104.4 (C-1′), 85.3 (C-3), 75.7 (C-11), 75.4 (C-2′), 68.7 (C-5′), 66.4 (C-12), 52.0 (C-3′), 45.1 (C-6), 43.8 (C-2), 38.6 (C-4′), 35.4 (C-10), 34.1 (C-5), 33.4 (C-4), 23.1 (Me—C═O), 21.0 (12-Me), 20.7 (C-6′), 17.7 (6-Me), 17.4 (4-Me), 16.1 (2-Me), 9.8 (10-Me). High-resolution FABMS: calc for C₂₅H₄₃O₈N (M+H)⁺ 484.2910, found 484.2892.

[0246] The fact that compounds of formula (7) and (8) bearing modified desosamine are produced by the desVI-deletion mutant is a thrilling discovery. However, this result is also somewhat surprising since the sugar component in the products is expected to be the aminodeoxy hexose (4). As illustrated in FIG. 27, it is possible that the compounds of formula (7) and (8) are derived from the predicted compounds of formula (9) and (10), respectively, by a post-synthetic nonspecific acetylation of the attached aminodeoxy sugar. It is also conceivable that N-acetylation of (4) occurs first, followed by coupling of the resulting sugar (11) to the 10-deoxymethynolide (6). Nevertheless, the lack of N-methylation of the sugar component in these new products provides convincing evidence sustaining the assignment of desVI as the N-methyltransferase gene. Most significantly, the production of the compounds of formula (7) and (8) by the desVI-deletion mutant attests to the fact that the glycosyltransferase (DesVII) in the methymycin/neomethymycin pathway is capable of recognizing and processing sugar substrates other than TDP-desosamine (5).

[0247] Since both compounds of formula (7) and (8) are new compounds synthesized in vivo by the S. venezuelae mutant strain, the observed N-acetylation might be a necessary step for self-protection (Cundliffe, 1989). In view of these results, the potential toxicity associated with new macrolide antibiotics produced by genetically engineered microorganisms can be minimized and newly formed antibiotics that have been deactivated (either deliberately or not) during production can be activated. Such an approach can be part of an overall strategy for the development of novel antibiotics using the combinatorial biosynthetic approach. Indeed, purified compounds of formula (7) and (8) are inactive against Streptococcus pyogenes grown on Mueller-Hinton agar plates (Mangahas, 1996), while the controls (a compound of formula (1) and (2)) show clearly visible inhibition zones.

[0248] It should be pointed out that a few glycosyltransferases involved in the biosynthesis of antibiotics have been shown to have relaxed specificity towards modified macrolactones (Jacobsen et al., 1997; Marsden et al., 1998; Weber et al., 1991). However, a similar relaxed specificity toward sugar substrates has only been reported for the daunorubicin glycosyltransferase, which is able to recognize a modified daunosamine and catalyze its coupling to the aglycone, ε-rhodomycinone (Madduri et al., 1998). Thus, the fact that the methymycin/neomethymycin glycosyltransferase can also tolerate structural variants of its sugar substrate indicates that at least some glycosyltransferases in antibiotic biosynthetic pathways may be useful to create biologically active hybrid natural products via genetic engineering.

[0249] Summary

[0250] The appended sugars in macrolide antibiotics are indispensable to the biological activities of these clinically important drugs. Therefore, the development of new antibiotics via a biological combinatorial approach requires detailed knowledge of the biosynthesis of these unusual sugars, as well as the ability to manipulate the biosynthetic genes to create novel sugars that can be incorporated into the final macrolide structures. A targeted deletion of the desVI gene of Streptomyces venezuelae, which has been predicted to encode an N-methyltransferase based on sequence comparison, was prepared to determine whether new methymycin/neomethymycin analogues bearing modified sugars can be generated by altering the desosamine biosynthetic genes. Growth of the S. venezuelae deletion mutant strain resulted in the accumulation of methymycin/neomethymycin analogues carrying an N-acetylated aminodeoxy sugar. Isolation and characterization of these derivatives not only provide the first direct evidence confirming the identity of desVI as the N-methyltransferase gene, but also demonstrate the feasibility of preparing novel sugars by the gene deletion approach. Most significantly, the results also revealed that the glycosyltransferase of methymycin/neomethymycin exhibits a relaxed specificity towards its sugar substrates.

Example 8 Cloning and Sequencing of the Met/Pik Biosynthetic Gene Cluster

Materials and Methods

[0251] Bacterial Strains and Media. E. coli DH5α was used as a cloning host. E. coli LE392 was the host for a cosmid library derived from S. venezuelae genomic DNA. LB medium was used in E. coli propagation. Streptomyces venezuelae ATCC 15439 was obtained as a freeze-dried pellet from ATCC. Media for vegetative growth and antibiotic production were used as described (Lambalot et al., 1992). Briefly, SGGP liquid medium was used for propagation of S. venezuelae mycelia. Sporulation agar (SPA) was used for production of S. venezuelae spores. Methymycin production was conducted in either SCM or vegetative medium and pikromycin production was performed in Suzuki glucose-peptone medium.

[0252] Vectors, DNA Manipulation and Cosmid Library Construction. pUC119 was the routine cloning vector, and pNJ1 was the cosmid vector used for genomic DNA library construction. Plasmid vectors for gene disruption were either pGM160 (Muth et al., 1989) or pKC1139 (Bierman et al., 1992). Plasmid, cosmid, and genomic DNA preparation, restriction digestion, fragment isolation, and cloning were performed using standard procedures (Sambrook et al., 1989; Hopwood et al., 1985). The cosmid library was made according to instructions from the Packagene λ-packaging system (Promega).

[0253] DNA Sequencing and Analysis. An Exonuclease III (ExoIII) nested deletion series combined with PCR-based double stranded DNA sequencing was employed to sequence the pik cluster. The ExoIII procedure followed the Erase-a-Base protocol (Stratagene) and DNA sequencing reactions were performed using the Dye Primer Cycle Sequencing Ready Reaction Kit (Applied Biosystems). The nucleotide sequences were read from an ABI PRISM 377 sequencer on both DNA strands. DNA and deduced protein sequence analyses were performed using the GeneWorks and GCG sequence analysis packages. All analyses were performed using the specific program default parameters.

[0254] Gene Disruption. A replicative plasmid-mediated homologous recombination approach was developed to conduct gene disruption in S. venezuelae. Plasmids for insertional inactivation were constructed by cloning a kanamycin resistance marker into target genes, and plasmids for gene deletion/replacement were constructed by replacing the target gene with a kanamycin or thiostrepton resistance gene in the plasmid. Disruption plasmids were introduced into S. venezuelae by either PEG-mediated protoplast transformation (Hopwood et al., 1985) or RK2-mediated conjugation (Bierman et al., 1992). Then, spores from individual transformants or transconjugants were cultured on non-selective plates to induce recombination. The cycle was repeated three times to enhance the opportunity for recombination. Double crossovers yielding targeted gene disruption mutants were selected and screened using the appropriate combination of antibiotics and finally confirmed by Southern hybridization.

[0255] Antibiotic Extraction and Analysis. Methymycin, pikromycin, and related compounds were extracted following published procedures (Cane et al., 1993). Thin layer chromatography (TLC) was routinely used to detect methymycin, neomethymycin, narbomycin and pikromycin. Further purification was conducted using flash column chromatography and HPLC, and the purified compounds were analyzed by ¹H and ¹³C NMR spectroscopy and mass spectrometry.

[0256] Results

[0257] Cloning and Identification of the pik Cluster. Heterologous hybridization was used to identify genes for methymycin, neomethymycin, narbomycin and pikromycin biosynthesis in S. venezuelae. Initial Southern blot hybridization analysis using a type I PKS DNA probe revealed two multifunctional PKS clusters of uncharacterized function in the genome. Since these four antibiotics all contain an identical desosamine residue, a tylAI α-D-glucose-1-phosphate thymidylyltransferase DNA probe (for mycaminose/mycarose/mycinose biosynthesis in the tylosin pathway) (Merson-Davies et al., 1994) was used to locate the corresponding biosynthetic gene cluster(s). This analysis established that only one of the PKS pathways contained a cluster of desosamine biosynthetic genes. Nine overlapping cosmid clones were isolated spanning over 80 kilobases (kb) on the bacterial chromosome that encompassed the entire gene cluster (pik) for methymycin, neomethymycin, narbomycin and pikromycin biosynthesis (FIG. 28). Through subsequent gene disruption, the other PKS cluster (vep, devoid of linked desosamine biosynthetic genes) was found to play no role in production of methymycin, neomethymycin, narbomycin or pikromycin.

[0258] Nucleotide Sequence of the pik Cluster. The nucleotide sequence of the pik cluster was completely determined and shown to contain 18 open reading frames (ORFs) that span approximately 60 kb. Central to the cluster are four large ORFs, pikAI, pikAII, pikAIII, and pikAIV, encoding a multifunctional PKS (FIG. 28). Analysis of the six modules comprising the pik PKS indicated that it would specify production of narbonolide, the 14-membered ring aglycone precursor of narbomycin and pikromycin (FIG. 28).

[0259] Initial analysis unveiled two significant architectural differences in the pikA-encoded PKS. First, compared with eryA (Donadio et al., 1998) and oleA (Swan et al., 1994), two PKS clusters that produce the 14-membered ring macrolides erythromycin and oleandomycin, which are similar to pikromycin, the presence of separate ORFs, pikAIII and pikAIV, encoding Pik module 5 and Pik module 6 as individual modules, as opposed to one bimodular protein as in eryAIII and oleAIII, is striking. Second, the presence of a type II thioesterase immediately downstream of the type I PKS cluster is also unprecedented (FIG. 28). These two characteristics suggest that pikA may produce the 12-membered ring macrolactone 10-deoxymethynolide as well. Indeed, the domain organization of PikAI-AIII (modules L-5) is consistent with the predicted biosynthesis of 10-deoxymethynolide except for the absence of a TE function at the C-terminus of Pik module 5 (PikAIII). The lack of a TE domain in PikAIII may be compensated by the type II TE (encoded by pikAV) immediately downstream of pikAIV. Consistent with the supposition that two distinct polyketide ring systems are assembled from the pik PKS, two macrolide-lincosamide-streptogramin B type resistance genes, pikR1 and pikR2, are found upstream of the pik PKS (FIG. 29), which presumably provide cellular self-protection for S. venezuelae.

[0260] The genetic locus for desosamine biosynthesis and glycosyl transfer is immediately downstream of pikA. Seven genes, desI, desII, desIII, desIV, desV, desVI, and desVIII, are responsible for the biosynthesis of the deoxysugar, and the eighth gene, desVII, encodes a glycosyltransferase that apparently catalyzes transfer of desosamine onto the alternate (12- and 14-membered ring) polyketide aglycones. The existence of only one set of desosamine genes indicates that DesVII can accept both 10-deoxymethynolide and narbonolide as substrates (Jacobsen et al., 1997). The largest ORF in the des locus, desR, encodes a β-glucosidase (Table 2) that is involved in a drug inactivation-reactivation cycle for bacterial self-protection.

[0261] Just downstream of the des locus are a gene (pikC) encoding a cytochrome P450 hydroxylase, PikC, similar to eryF (Andersen et al., 1992) and eryK (Stassi et al., 1993), and a gene (pikD) encoding a putative regulator protein, PikD (FIG. 28). Interestingly, PikC is the only P450 hydroxylase identified in the entire pik cluster, suggesting that the enzyme can accept both 12- and 14-membered ring macrolide substrates and, more remarkably, that it is active on both C-10 and C-12 of YC-17 (the 12-membered ring intermediate) to produce methymycin and neomethymycin (FIG. 30). PikD is a putative regulatory protein similar to ORFH in the rapamycin gene cluster (Schwecke et al., 1995).

[0262] The combined functionality coded by the eighteen genes in the pik cluster predicts biosynthesis of methymycin, neomethymycin, narbomycin and pikromycin (Table 2). Flanking the pik cluster locus are genes presumably involved in primary metabolism and genes that may be involved in both primary and secondary metabolism. An S-adenosyl-methionine synthase gene located downstream of pikD may help to provide the methyl group in desosamine synthesis. A threonine dehydratase gene identified upstream of pikR1 may provide precursors for polyketide biosynthesis. It is not apparent that any of these genes are dedicated to antibiotic biosynthesis, and they are not directly linked to the pik cluster.

TABLE 2 Deduced function of ORFs in the pik cluster

Polypeptide (ORF)   Amino acids, no.   Proposed function or sequence similarity detected
PikAI               4,613              PKS
                                         Loading module   KS^(Q) AT(P) ACP
                                         Module 1         KS AT(P) KR ACP
                                         Module 2         KS AT(A) DH KR ACP
PikAII              3,739              PKS
                                         Module 3         KS AT(P) KR⁰ ACP
                                         Module 4         KS AT(P) DH ER KR ACP
PikAIII             1,562              PKS
                                         Module 5         KS AT(P) KR ACP
PikAIV              1,346              PKS
                                         Module 6         KS AT(P) ACP TE
PikAV               281                Thioesterase II (TEII)
DesI                415                4-Dehydrase
DesII               485                Reductase?
DesIII              292                α-D-Glucose-1-phosphate thymidylyltransferase
DesIV               337                TDP-glucose 4,6-dehydratase
DesV                379                Transaminase
DesVI               237                N,N-dimethyltransferase
DesVII              426                Glycosyl transferase
DesVIII             402                Tautomerase?
DesR                809                β-Glucosidase (involved in resistance mechanism)
PikC                418                P450 hydroxylase
PikD                945?               Putative regulator
PikR1               336                rRNA methyltransferase (mls resistance)
PikR2               288?               rRNA methyltransferase (mls resistance)

[0263] TABLE 3 Summary of mutational analyses of the pik cluster

                                                Antibiotic production/Intermediate accumulation
Mutant   Type of mutation       Target gene     Met & neomethymycin        Pikromycin
AX903    Insertion              pikAI           No/No                      No/No
LZ3001   Deletion/replacement   desVI           No/10-deoxymethynolide     No/narbonolide
LZ4001   Deletion/replacement   desV            No/10-deoxymethynolide     No/narbonolide
AX905    Deletion/replacement   pikAV           <5%/No                     <5%/No
AX906    Insertion              pikC            No/YC-17                   No/narbomycin

[0264] Mutational Analysis of the pik Cluster. Extensive disruption of genes in the pik cluster was carried out to address the role of key enzymes in antibiotic production (Table 3). First, PikAI, the first putative enzyme involved in the biosynthesis of 10-deoxymethynolide and narbonolide, was inactivated by insertional mutagenesis. The resulting mutant, AX903, produced none of methymycin, neomethymycin, narbomycin or pikromycin, indicating that pikA encodes a PKS required for both 12- and 14-membered ring macrolactone formation.

[0265] Second, deletion of either desVI or desV abolished methymycin, neomethymycin, narbomycin and pikromycin production, and the resulting mutants, LZ3001 and LZ4001, respectively, accumulated 10-deoxymethynolide and narbonolide in their culture broth, indicating that enzymes for desosamine synthesis and transfer are also shared by the 12- and 14-membered ring macrolides.

[0266] In order to understand the mechanism of polyketide chain termination at PikAIII (module 5, which is presumed to be the termination point in construction of 10-deoxymethynolide), the pik TEII gene, pikAV, was deleted. The deletion/replacement mutant, AX905, produces less than 5% of the methymycin and neomethymycin, and less than 5% of the pikromycin, produced by wild type S. venezuelae. This abrogation in product formation occurs without significant accumulation of the expected aglycone intermediates, suggesting that Pik TEII is involved in the termination of 12- as well as 14-membered ring macrolides at PikAIII and PikAIV, respectively. Although polar effects could in principle influence the observed phenotype in AX905, this possibility was ruled out by consideration of mutant LZ3001, in which a mutation in a gene downstream of pikAV led to accumulation of 10-deoxymethynolide and narbonolide. The fact that mutant AX905 failed to accumulate these intermediates suggested that the polyketide chains were not efficiently released from the PKS protein in the absence of Pik TEII. Therefore, Pik TEII plays a crucial role in polyketide chain release and cyclization, and it presumably provides the mechanism for alternative termination in pik polyketide biosynthesis.

[0267] Finally, disruption of pikC confirmed that PikC is the sole enzyme catalyzing hydroxylation of both YC-17 (at C-10 and C-12) and narbomycin (at C-12). The relaxed substrate specificity of PikC and its regiochemical specificity at C-10 and C-12 provide another layer of metabolite diversity in the pik-encoded biosynthetic system.

[0268] Discussion

[0269] The work described herein has established that methymycin, neomethymycin, narbomycin and pikromycin biosynthesis is encoded by the pik cluster in S. venezuelae. Three key enzymes as well as the unique architecture of the cluster enable this relatively compact system to produce multiple macrolide antibiotics. Foremost, the presence of Pik modules 5 and 6 as separate proteins, PikAIII and PikAIV, and the activity of Pik TEII enable the bacterium to terminate the polyketide chain at two different points of assembly, thereby producing two macrolactones of different ring size. Second, DesVII, the glycosyltransferase in the pik cluster, can accept both 12- and 14-membered ring macrolactones as substrates. Finally, PikC, the P450 hydroxylase, has a remarkable substrate and regiochemical specificity that introduces another layer of diversity into the system.

[0270] It is interesting to consider that pikA evolved in a line analogous to eryA and oleA, since each of these PKSs specifies the synthesis of 14-membered ring macrolactones. On this view, pik may have acquired the capacity to generate methymycin when a mutation in the primordial pikAIII-pikAIV linker region caused splitting of Pik modules 5 and 6 into two separate gene products. This notion is raised by two features of the nucleotide sequence. First, the intergenic region between pikAIII and pikAIV, which is 105 bp, may be the remnant of an intermodular linker peptide of 35 amino acids. Second, the potential for independently regulated expression of pikAIV is implied by the presence of a 100 nucleotide region at the 5′ end of the gene that is relatively AT-rich (62% G+C, compared with 74% G+C content in the coding region). Thus, as the mutation in an original ORF encoding the bimodular multifunctional protein (PikAIII-PikAIV) occurred, so too may have evolved a mechanism for regulated synthesis of the new gene product (PikAIV).
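The arithmetic behind the two sequence features above can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names `codons_in` and `gc_fraction` are hypothetical, and the short test sequence is made up for demonstration rather than taken from the pik cluster.

```python
CODON_LENGTH = 3  # nucleotides per codon


def codons_in(length_bp: int) -> int:
    """Number of whole codons in a region of the given length (hypothetical helper)."""
    if length_bp % CODON_LENGTH != 0:
        raise ValueError("region length is not a multiple of three")
    return length_bp // CODON_LENGTH


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence (hypothetical helper)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # A 105 bp intergenic region corresponds to 35 codons, i.e. a 35-residue linker.
    print(codons_in(105))
    # Made-up example sequence, only to show how a G+C percentage is computed.
    print(round(100 * gc_fraction("GCGCAT"), 1))
```

Applied to the 100 nucleotide region at the 5′ end of pikAIV and to the coding region, the second helper would yield the G+C percentages being compared in the text (62% versus 74%).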

[0271] The role of Pik TEII in alternative termination of polyketide chain elongation intermediates provides a unique aspect of diversity generation in natural product biosynthesis. Engineered polyketides of different chain length are typically generated by moving the TE catalytic domain to alternate positions in a modular PKS (Cortes et al., 1995). Repositioning of the TE domain necessarily abolishes production of the original full-length polyketide, so only one macrolide is produced each time. In contrast to the fixed-position TE domain, the independent Pik TEII polypeptide presumably has the flexibility to catalyze termination at different stages of polyketide assembly, thereby enabling the system to produce multiple products of variant chain length. Combinatorial biology technologies can now exploit this system for generating molecular diversity through construction of novel PKS systems with TEIIs for simultaneous production of several new molecules, as opposed to the TE domains alone, which limit catalysis to a single termination step.

[0272] It is noteworthy that sequences similar to Pik TEII are found in almost all known polyketide and non-ribosomal polypeptide biosynthetic systems (Marahiel et al., 1997). Currently, the pik TEII is the first to be characterized in a modular PKS. However, recent work on a TEII gene in the lipopeptide surfactin biosynthetic cluster (Schneider et al., 1998) demonstrated that srf-TEII plays an important role in polypeptide chain release, and may suggest that srf-TEII reacts at multiple stages in peptide assembly as well (Marahiel et al., 1997).

[0273] The enzymes involved in post-polyketide assembly of 10-deoxymethynolide and narbonolide are particularly intriguing, especially the glycosyltransferase, DesVII, and the P450 hydroxylase, PikC. Both have the remarkable ability to accept substrates with significant structural variability. Moreover, disruption of desVI demonstrated that DesVII also tolerates variations in deoxysugar structure (Example 6). Likewise, PikC has recently been shown to convert YC-17 to methymycin/neomethymycin and narbomycin to pikromycin in vitro.

[0274] Targeted gene disruption of ORF1 abolished both pikromycin and methymycin production, indicating that the single cluster is responsible for biosynthesis of both antibiotics. Deletion of the TE2 gene substantially reduced methymycin and pikromycin production, which demonstrates that TE2, in contrast to the position-fixed TE1 domain, has the capacity to release the polyketide chain at different points during the assembly process, thereby producing polyketides of different chain length.

[0275] The results described above were unexpected in several respects: that one PKS cluster produces two macrolides which differ in the number of atoms in their ring structure, that module 5 and module 6 of the PKS are in ORFs that are separated by a spacer region, that PikAIII lacked a TE, that there was a Type II thioesterase, that the TEI domain was not separate, and that two resistance genes were identified which may be specific for either a 12- or 14-membered ring.

[0276] With eighteen genes spanning less than 60 kb of DNA capable of producing four active macrolide antibiotics, the pik cluster represents the least complex yet most versatile modular PKS system so far investigated. This simplicity provides the basis for a compelling expression system in which novel active ketoside products are engineered and produced with considerable facility for discovery of a diverse range of new biologically active compounds.

Summary

[0277] Complex polyketide synthesis follows a processive reaction mechanism, and each module within a PKS harbors a string of three to six enzymatic domains that catalyze reactions in nearly linear order, as described in particular detail for the erythromycin-producing PKS (Katz, 1997; Khosla, 1997; Staunton et al., 1997). The combined set of PKS modules and catalytic domains, along with genes that encode enzymes for post-polyketide tailoring (e.g., glycosyl transferases, hydroxylases), typically limits a biosynthetic system to the generation of a single polyketide product.

[0278] Combinatorial biology involves the genetic manipulation of multistep biosynthetic pathways to create molecular diversity in natural products for use in novel drug discovery. PKSs represent one of the most amenable systems for combinatorial technologies because of their inherent genetic organization and ability to produce polyketide metabolites, a large group of natural products generated by bacteria (primarily actinomycetes and myxobacteria) and fungi with diverse structures and biological activities. Complex polyketides are produced by multifunctional PKSs involving a mechanism similar to long-chain fatty acid synthesis in animals (Hopwood et al., 1990). Pioneering studies (Cortes et al., 1990; Donadio et al., 1991) on the erythromycin PKS in Saccharopolyspora erythraea revealed a modular organization. Characterization of this multidomain protein system, followed by molecular analysis of the rapamycin (Aparicio et al., 1996), FK506 (Motamedi et al., 1997), soraphen A (Schupp et al., 1995), niddamycin (Kakavas et al., 1997), and rifamycin (August et al., 1998) PKSs, demonstrated a co-linear relationship between the modular structure of a multifunctional bacterial PKS and the structure of its polyketide product.

[0279] In a survey of microbial systems capable of generating unusual metabolite structural variability, Streptomyces venezuelae ATCC 15439 is notable in its ability to produce two distinct groups of macrolide antibiotics. Methymycin and neomethymycin are derived from the 12-membered ring macrolactone 10-deoxymethynolide, while narbomycin and pikromycin are derived from the 14-membered ring macrolactone, narbonolide. The cloning and characterization of the biosynthetic gene cluster for these antibiotics reveals the key role of a type II thioesterase in forming a metabolic branch through which polyketides of different chain length are generated by the pikromycin multifunctional polyketide synthase (PKS). Immediately downstream of the PKS genes (pikA) is a set of genes for desosamine (des) biosynthesis and macrolide ring hydroxylation. The glycosyl transferase (encoded by desVII) has the remarkable ability to catalyze glycosylation of both the 12- and 14-membered ring macrolactones. Moreover, the pikC-encoded P450 hydroxylase provides yet another layer of structural variability by introducing regiochemical diversity into the macrolide ring systems.

Example 9 Strategies Employing Modular PKS as PHA Monomer Providers

[0280] One strategy to exploit modular PKSs, e.g., modules of pikA or a FAS, to provide PHA monomers is to harvest polyketide intermediates as CoA derivatives using a TEII which is converted to an acyl-CoA transferase (mTEII). PikTEII is a small enzyme (281 amino acids) encoded by pikAV in S. venezuelae. The primary function of the wild-type enzyme is to catalyze the release of a polyketide chain at the fifth module in the pikA pathway as 10-deoxymethynolide. The enzyme most likely binds to the ACP of the fifth module (PikAIII), ACP₅, and releases the acyl chain attached to it. This relationship between TEII and its cognate ACP₅ can be exploited to produce polyketides having different chain lengths by moving Pik ACP₅ to a different position in the cluster. For example, by moving ACP₅ into the second module in place of ACP₂, a triketide instead of a hexaketide may be produced by the cluster. Further, by moving KR5 together with ACP₅ into the second module, replacing the DH, KR, and ACP domains, a 3-hydroxy triketide is produced that is structurally suitable as a PHA monomer. A mutant TEII (mTEII) catalyzes the release of the triketide in its CoA form. The triketide-CoA, 3,5-dihydroxy-4-methylheptanoyl-CoA, is a substrate for a PHA polymerase, e.g., PhaC1 from P. oleovorans, which, in turn, can incorporate the monomer into a polymer.

[0281] A second strategy for 3-hydroxyacyl-CoA monomer production is to harvest a polyketide intermediate as a CoA derivative by exploiting the TE domain (TEI) within the PKS module, after it has been converted to an acyl-CoA transferase (mTEI). It has been demonstrated that the TE domain can release polyketide intermediates attached to the ACP domain within the same module. Moving the TEI to a different position in a PKS cluster results in the production of a polyketide having a different chain length. Similarly, a mutant TEI (mTEI) (i.e., one which is an acyl-CoA transferase) releases the polyketide intermediate as an acyl-CoA, which is then polymerized by PHA synthetase. Preferably, a mutant TE domain in the pikA4 gene cluster is moved into pik module 1, fusing it immediately downstream of ACP1. The recombinant enzyme produces 2-(S)-methyl-3-(R)-hydroxyvaleryl-CoA, which is a suitable substrate for PHA polymerase PhaC1. Therefore, the coexpression of the polymerase with the recombinant PKS produces a polymer.

[0282] A third strategy is to directly collect polyketide intermediates as substrates for PHA synthesis by fusing a PHA polymerase with a polyketide synthase. The first two strategies produce 3-hydroxyacyl-CoA as a substrate for PHA synthesis by employing a mutant PKS enzyme (TEI or TEII). As PHA polymerase may be active on acyl-ACP itself if the acyl-ACP is properly oriented, the third strategy fuses a PHA polymerase downstream of an ACP in a PKS protein. The PHA synthetase then serves as a domain within the chimeric multifunctional enzyme in place of a TE domain. The PKS portion of the protein catalyzes the synthesis of a 3-hydroxyacyl-ACP intermediate, and the PHA synthetase domain then accepts it as a substrate and adds the 3-hydroxyacyl monomer to the growing polyhydroxyalkanoate chain. The process regenerates ACP function so that the reaction can proceed repeatedly to synthesize a PHA of multiple units. For example, a phaC1 gene is fused directly downstream of pik ACP1 so as to produce a chimeric enzyme that catalyzes the synthesis of a polymer.

[0283] The strategies described above can produce PHAs of complex structure and with superior properties. In addition, the structure can be easily fine-tuned by modifying the PKS gene, thus resulting in PHAs having desired properties or functions.

Example 10 Control of Macrolactone Structure by Alternative Expression of a Modular Polyketide Synthase

[0284] Materials and Methods

[0285] Media. Streptomyces venezuelae ATCC 15439 produces two groups of macrolide antibiotics: the 12-membered ring macrolides methymycin and neomethymycin, and the 14-membered ring macrolides pikromycin and narbomycin (FIG. 28). Methymycin and neomethymycin are derived from the 12-membered ring macrolactone 10-deoxymethynolide and are produced in SCM medium (Lambalot et al., 1992), whereas pikromycin and narbomycin are derived from the 14-membered ring macrolactone narbonolide and are produced in PGM medium (Xue et al., 1998).

[0286] Genetic Manipulation of S. venezuelae. Mutants AX910 and AX912 were created by targeted gene replacement. The mutation plasmid pDHS910 was created by ligating two DNA fragments flanking the TE domain so that the TE domain was deleted and a hexa-histidine sequence was introduced at its position. The primer pairs that were used to amplify the flanking DNA by polymerase chain reaction (PCR) are 5′-CCCGAATTCGCCGCCGCCATGGCCGAA-3′ (SEQ ID NO:42) and 5′-GTGATGCATCGGCTCGGCGACGGCCCAGTTCCGCT-3′ (SEQ ID NO:43); and 5′-ATGCATCACCACCACCACCACTGAGGGGGCGGGCAAGTGACCGAC-3′ (SEQ ID NO:44) and 5′-GGGTCTAGAGCTGCACCGGCGGGTCGTAGCGGA-3′ (SEQ ID NO:45). Plasmid pDHS910 was introduced into S. venezuelae AX905 (Xue et al., 1998), which has a kanamycin resistance marker at the position of pikAV. Following procedures established by Xue et al. (1998), mutant AX910 (12 colonies) was isolated by screening for a kanamycin sensitive phenotype. The expected genotype of the mutant was confirmed by genomic Southern hybridization. Mutation plasmid pDHS912 was generated by replacing a BamHI-BglII fragment (the DNA fragment corresponding to the pikAV gene immediately downstream of the TE domain) in pDHS910 with a kanamycin resistance gene (Denis et al., 1992). Thus, the TE domain as well as the TEII gene pikAV were disrupted in the mutant AX912. Plasmid pDHS912 was transferred into wild type S. venezuelae, and mutant AX912 (12 colonies) was selected according to the procedures of Xue et al. (1998).
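As an aid to reading the four primer sequences above, the following Python sketch scans them for a few common restriction enzyme recognition sequences and reads the first codons of SEQ ID NO:44. It is a sketch under stated assumptions: the text does not say which enzymes were used with these primers, so the enzyme list is an illustrative choice, the helper names `find_sites` and `leading_codons` are hypothetical, and reading SEQ ID NO:44 in frame from its first base is an assumption made only to show where the histidine codons fall.

```python
# Recognition sequences of some common restriction enzymes. All are palindromic,
# so scanning one strand is sufficient. The choice of enzymes is illustrative.
SITES = {
    "EcoRI": "GAATTC",
    "NsiI": "ATGCAT",
    "XbaI": "TCTAGA",
    "BamHI": "GGATCC",
    "BglII": "AGATCT",
}

# Primer sequences copied from the paragraph above (5' to 3').
PRIMERS = {
    "SEQ ID NO:42": "CCCGAATTCGCCGCCGCCATGGCCGAA",
    "SEQ ID NO:43": "GTGATGCATCGGCTCGGCGACGGCCCAGTTCCGCT",
    "SEQ ID NO:44": "ATGCATCACCACCACCACCACTGAGGGGGCGGGCAAGTGACCGAC",
    "SEQ ID NO:45": "GGGTCTAGAGCTGCACCGGCGGGTCGTAGCGGA",
}


def find_sites(seq):
    """Map enzyme name to the 0-based index of the first matching site in seq."""
    seq = seq.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


def leading_codons(seq, n):
    """First n triplets of seq, read from its first base (frame is an assumption)."""
    return [seq[i:i + 3] for i in range(0, 3 * n, 3)]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
    # CAT and CAC both encode histidine; TGA is a stop codon.
    print(leading_codons(PRIMERS["SEQ ID NO:44"], 8))
```

Read this way, SEQ ID NO:44 carries six consecutive histidine codons followed by a stop codon, which agrees with the hexa-histidine sequence described in the text.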

[0287] Western Blot Analysis. Western blot analysis of PikAIV followed standard procedures (Sambrook et al., 1989). The total protein of S. venezuelae AX910, AX912, or wild type was first prepared from a four-day culture in either SCM or PGM medium. The protein extract was separated on a 10% SDS-PAGE gel, transferred to PVDF membrane (Bio-Rad, Hercules, Calif.), probed with anti-6×His antibody (Qiagen, Valencia, Calif.), and visualized using a secondary antibody conjugated to alkaline phosphatase (Sigma, St. Louis, Mo.).

[0288] Construction of Complementation Plasmids. The pikA promoter, PpikA, was isolated as an EcoRV-EcoRI fragment between pikAI and pikR1 in the pik cluster (Xue et al., 1998). To create a plasmid for complementation, a DNA fragment encoding PikAV was first PCR-amplified and placed downstream of the EcoRI site in such a way that PikAV was translationally coupled to the leader sequence of pikAI in PpikA to give plasmid pDHS702. Then, plasmids pDHS704, pDHS705, pDHS706, pDHS707, and pDHS708 were constructed by cloning various lengths of the pikAIV-pikAV region into pDHS702, replacing pikAV. The pikAV fragment and the various lengths of pikAIV were PCR-amplified from cosmid pLZ51 (Xue et al., 1998) using the following primer pairs: 5′-GAATTCATCGAGGGGGCGGGCAAGTGA-3′ (SEQ ID NO:46) and 5′-ATGCATCAGGTCGTCGGTCACCGTGGGTTCT-3′ (SEQ ID NO:47) for pDHS702; 5′-GGATCCGCGCCGGGATGTTCCGCGCCCTGT-3′ (SEQ ID NO:48) and 5′-AAAATGCATCAGAGGTCTGTCGGTCACTTGC-3′ (SEQ ID NO:49) for pDHS704; 5′-AAAAGATCTTGATGGTGCAGGCGCTGCGCCACGGGGTGCTG-3′ (SEQ ID NO:50) and 5′-AAAATGCATCAGAGGTCTGTCGGTCACTTGC-3′ (SEQ ID NO:49) for pDHS708; and 5′-AAAAGATCTCCAACGAACAGTTGGTGGACGCT-3′ (SEQ ID NO:51) and 5′-AAAATGCATCAGAGGTCTGTCGGTCACTTGC-3′ (SEQ ID NO:49) for pDHS707.

[0289] The fragments in pDHS705 (EcoRI-BamHI) and pDHS706 (EcoRI-BglII) were isolated directly from restriction digestion of cosmid pLZ51 (Xue et al., 1998) and ligated into EcoRI-BglII treated pDHS702.

[0290] Antibiotic Extraction and Identification. Extraction, identification, and quantitation of methymycin and related compounds followed a procedure developed by Cane et al. (1993), which is summarized in Xue et al. (1998).

[0291] Results and Discussion

[0292] Deletion of the TE Domain from PikAIV. Production of both 10-deoxymethynolide and narbonolide is mediated by a single PKS cluster (pikA) in S. venezuelae (Xue et al., 1998). The pikA-encoded PKS is composed of the multifunctional proteins PikAI, PikAII, PikAIII, and PikAIV (FIG. 28), which are similar to EryAI-AIII except that PikAIII and PikAIV each contain a single module, in contrast to the bimodular EryAIII (Donadio et al., 1991). Moreover, PikAV is an independent thioesterase (TEII) that is distinct from the thioesterase domain (TE) located at the C-terminus of PikAIV. The modular organization of PikA indicates that PikAI-PikAIII produces a hexaketide that cyclizes into 10-deoxymethynolide, and that PikAI-PikAIV produces a heptaketide that cyclizes into narbonolide (FIG. 28). Termination of polyketide assembly at the heptaketide stage is likely catalyzed by the C-terminal TE domain in PikAIV, which is analogous to chain termination in the erythromycin pathway. However, it was not clear how the PikA system terminates polyketide assembly to produce the 12-membered ring aglycone, 10-deoxymethynolide. Genetic evidence excluded PikAV (TEII) as the determining factor in alternative termination, since deletion of pikAV reduced the production of both macrolactones (Xue et al., 1998).
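The co-linear reasoning in this paragraph (the number of modules traversed determines the chain length and hence the macrolactone) can be sketched as follows. This is a minimal illustration, assuming the module layout listed in Table 2; the names `PIK_PKS`, `building_blocks`, and `predicted_product` are hypothetical, and the mapping from chain length to product simply restates what the text says.

```python
# Module layout of the pikA-encoded PKS as listed in Table 2.
PIK_PKS = [
    ("PikAI", ["Loading module", "Module 1", "Module 2"]),
    ("PikAII", ["Module 3", "Module 4"]),
    ("PikAIII", ["Module 5"]),
    ("PikAIV", ["Module 6"]),
]

# Chain-length names and products as stated in the text.
KETIDE_NAMES = {6: "hexaketide", 7: "heptaketide"}
PRODUCTS = {"hexaketide": "10-deoxymethynolide", "heptaketide": "narbonolide"}


def building_blocks(last_protein):
    """Count starter plus extender units used up to and including last_protein."""
    count = 0
    for protein, modules in PIK_PKS:
        count += len(modules)  # one unit per module (loading module = starter unit)
        if protein == last_protein:
            return count
    raise KeyError(last_protein)


def predicted_product(last_protein):
    """Return (chain-length name, macrolactone) for termination after last_protein."""
    name = KETIDE_NAMES[building_blocks(last_protein)]
    return name, PRODUCTS[name]


if __name__ == "__main__":
    print(predicted_product("PikAIII"))  # termination after module 5
    print(predicted_product("PikAIV"))   # termination after module 6
```

Termination after PikAIII gives six building blocks and termination after PikAIV gives seven, matching the hexaketide and heptaketide assignments above.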

[0293] To study the role of PikAIV in alternative termination, two mutant strains of S. venezuelae were created in which PikAIV was disrupted by deleting the C-terminal thioesterase (TE) domain. In mutant AX910, an in-frame deletion was engineered to remove the TE domain from the S. venezuelae chromosome. In a second mutant, AX912, the TE domain as well as the downstream TEII gene (pikAV) was removed from the bacterial chromosome. As expected, S. venezuelae AX912 is devoid of antibiotic production, since the mutant lacks the thioesterase activities that are necessary to release the polyketide chain from the Pik PKS protein. It was expected that the AX910 mutant strain would at least produce the 12-membered ring macrolides methymycin and neomethymycin, because the sixth condensation cycle catalyzed by PikAIV is not required for 10-deoxymethynolide formation. Surprisingly, mutant AX910 produced trace amounts of pikromycin; however, methymycin and neomethymycin were completely absent from the fermentation broth. Since the AX910 mutant contains an in-frame deletion of the pikAIV-encoded TE domain, the potential for a downstream polar effect (on the pikAV-encoded TEII enzyme) was avoided. This result suggested that PikAIV, or at least the TE domain within PikAIV, is involved directly in the production of the 12- as well as 14-membered ring macrolactones.

[0294] Probing the expression of PikAIV. To investigate the differential expression of pikAIV using culture conditions for methymycin (SCM medium) or pikromycin (PGM medium) production, the PikAIV protein was first tagged with a hexa-histidine sequence replacing the TE domain at its C-terminus. Expression of PikAIV was then probed with anti-6×His antibody in a Western blot that revealed a single protein band under conditions for either methymycin or pikromycin production in the mutant strains (AX910 and AX912). Interestingly, the protein detected from cell extracts obtained under culture conditions for methymycin production (SCM medium) was approximately 25 kDa lower in molecular weight compared to the protein detected under conditions for pikromycin production (PGM medium). The molecular weight of the protein detected under pikromycin culture conditions is 110 kDa, which is consistent with the predicted TE-truncated (6×His-tag replaced) form of PikAIV. Therefore, the protein detected under conditions for methymycin production must be an N-terminal truncated form of PikAIV (FIG. 41). Indeed, two potential alternative translation start sites have been located in the pikAIV sequence, with either predicted to generate the truncated form of PikAIV. The presumed alternative expression of pikAIV creates a protein product that contains only half of the Pik module 6 KS (KS₆) domain (FIG. 41). This result immediately pointed to a mechanism for alternative termination in the PikA system. Since the KS₆ domain is responsible for the condensation of the final extender unit, a PKS that is unable to catalyze this reaction could only produce the 12-membered ring macrolactone.
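The size comparison in this paragraph can be worked through numerically. The sketch below is a rough estimate only: the average residue mass of about 110 Da is a common rule of thumb and is not a value given in the text, and the function names `truncated_mass_kda` and `approx_residues` are hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein.
# This value is an assumption for illustration; it does not come from the text.
AVG_RESIDUE_MASS_DA = 110.0


def truncated_mass_kda(full_kda, shift_kda):
    """Apparent mass of the smaller band, given the larger band and the shift."""
    return full_kda - shift_kda


def approx_residues(mass_kda):
    """Approximate number of residues corresponding to a mass in kDa."""
    return round(mass_kda * 1000 / AVG_RESIDUE_MASS_DA)


if __name__ == "__main__":
    # Band under pikromycin conditions: 110 kDa; band under methymycin
    # conditions: approximately 25 kDa lower.
    print(truncated_mass_kda(110, 25))
    # Approximate number of N-terminal residues missing from the smaller form.
    print(approx_residues(25))
```

Under that assumption, the smaller band corresponds to roughly 85 kDa, and the 25 kDa difference corresponds to on the order of a couple of hundred residues removed from the N-terminus.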

[0295] Complementation analysis of PikAIV. To investigate the functioning of the truncated form of PikAIV, the contribution of various domains in the multifunctional protein was tested by genetic complementation of S. venezuelae mutant strain AX912. An SCP2*-based low copy number plasmid (Lydiate et al., 1985) was designed, and the target gene (comprised of alternative-length forms of pikAIV) was placed under the control of the native pikA promoter (Xue et al., 1998). Using this system, the expression of pikAIV from the plasmid would most closely resemble its normal temporal expression profile, and would also be synchronized with expression of the pikA cluster encoded on the S. venezuelae chromosome. This system was used to test the ability of alternative forms of the pikAIV-pikAV region (FIG. 41) to complement the TE-TEII double mutant strain AX912.

[0296] The results clearly demonstrated that the TE domain in PikAIV is critical for 10-deoxymethynolide formation. Specifically, all of the plasmid constructs that contain the TE domain, including pDHS704 (TE alone), pDHS705 (ACP₆-TE), pDHS706 (ACP₆-TE::TEII), pDHS708 (AT₆-ACP₆-TE), and pDHS707 (KS₆-AT₆-ACP₆-TE), complemented mutant AX912 to give 10-deoxymethynolide. Interestingly, other domains in the truncated form of PikAIV, especially the AT domain, were necessary for effective production of 10-deoxymethynolide. The most efficient production of 10-deoxymethynolide resulted from complementation by pDHS708 (AT₆-ACP₆-TE), which contains the AT domain and closely mimics the truncated form of PikAIV detected in wild type S. venezuelae under conditions for methymycin production (FIG. 41). The relatively efficient complementation by the TE domain alone (pDHS704) leading to 10-deoxymethynolide is especially intriguing and may result from one or both of two possible complementation scenarios. Specifically, it may involve interaction of the TE domain directly with PikAIII (FIG. 42C) and/or formation of a wild type-like PKS complex (FIG. 42B) by the TE domain expressed from the plasmid interacting with the rest of PikAIV (expressed from the corresponding AX912 chromosomal allele) through noncovalent interactions.

[0297] Interestingly, the TE domain alone did not complement AX912 (TE-TEII double mutant) to give narbonolide production (FIG. 41). This is consistent with a recent result (Gokhale et al., 1999) obtained from the erythromycin PKS system suggesting that the TE domain may not interact significantly with its natural endogenous module (e.g., EryAIII or PikAIV) but must be covalently linked to be functional. However, the failure to complement may be due in part to introduction of the hexa-histidine tag at the C-terminus of the engineered PikAIV protein in AX912. Interestingly, pDHS708 (AT₆-ACP₆-TE) did complement AX912 under culture conditions for pikromycin production, resulting in equal amounts of 10-deoxymethynolide and narbonolide (FIG. 41). This product pattern occurs due to formation of hetero- and homodimeric structures of PikAIV as shown in FIG. 42E and FIG. 42F, respectively. These results are in accord with a model in which an N-terminal truncated form of PikAIV is responsible for 10-deoxymethynolide formation while expression of full-length PikAIV is responsible for narbonolide production.

[0298] Comparing the complementation of pDHS705 (ACP₆-TE) and pDHS706 (ACP₆-TE::TEII) further revealed the activity of pik TEII. Although TEII alone is not sufficient for polyketide termination (as shown in pDHS702 complementation, see FIG. 41), the independent thioesterase did enhance the production of both 10-deoxymethynolide and narbonolide (FIG. 41). Particularly in the case of narbonolide formation, the presence of TEII in pDHS706 (ACP₆-TE::TEII) complementation helped to boost polyketide production to a level that was otherwise undetectable in AX912 (pDHS705 (ACP₆-TE)). This accessory role of TEII is consistent with previous observations in the pikromycin system (Xue et al., 1998), as well as with other PKS (Rangaswamy et al., 1998) and non-ribosomal peptide synthetase (NRPS) systems (Schneider et al., 1998).
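The complementation results described above can be summarized as a simple lookup. The following is only an illustrative sketch, not part of the described method: the domain lists are taken from the construct names given in the text, pDHS702 is assumed to carry TEII only (as implied by "TEII alone"), and the function names are hypothetical.

```python
# Domain content of each complementation plasmid, as named in the text.
# Assumption: pDHS702 carries TEII only ("TEII alone" in the text).
CONSTRUCTS = {
    "pDHS702": ["TEII"],
    "pDHS704": ["TE"],
    "pDHS705": ["ACP6", "TE"],
    "pDHS706": ["ACP6", "TE", "TEII"],
    "pDHS707": ["KS6", "AT6", "ACP6", "TE"],
    "pDHS708": ["AT6", "ACP6", "TE"],
}


def gives_10_deoxymethynolide(domains):
    """Hypothetical rule drawn from the text: a construct complements
    AX912 to give 10-deoxymethynolide only if it carries the TE domain."""
    return "TE" in domains


def complementing_constructs(constructs=CONSTRUCTS):
    """Return the sorted names of constructs predicted to complement AX912."""
    return sorted(
        name for name, domains in constructs.items()
        if gives_10_deoxymethynolide(domains)
    )


if __name__ == "__main__":
    for name in complementing_constructs():
        print(name, "-".join(CONSTRUCTS[name]))
```

The rule captures only the qualitative presence or absence of 10-deoxymethynolide; it does not model the differences in production efficiency among the constructs.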

[0299] Mechanistic Models for the Alternative Termination by PikAIV. The complementation experiments described above strongly suggest that TE is the key enzymatically active domain in the truncated PikAIV polypeptide, although the entire protein (including AT, ACP, TE, and probably a partial KS domain) is much more effective for polyketide production. A structural model based on the proposed helical form of the erythromycin PKS complex (Staunton et al., 1996) was developed to illustrate the role of PikAIV in alternative termination in the pik-encoded PKS. Under conditions for pikromycin production, wild type S. venezuelae expresses a full length PikAIV module, which interacts with PikAIII and elongates the growing polyketide chain on ACP₅ by adding a methylmalonate unit (the activity of KS₆) to ultimately produce the 14-membered ring macrolactone, narbonolide (FIG. 42A). On the other hand, the truncated form of PikAIV that lacks KS₆ is expressed under culture conditions for methymycin production. The molecular space left unoccupied by KS₆ truncation is then presumably filled by the TE domain, which would be aligned to interact directly with ACP₅ to release the 12-membered ring macrolactone (FIG. 42B). In both cases, the main part of PikAIV is predicted to remain fixed. A small movement of the TE domain into the unoccupied space (left by KS₆ truncation) would result in the bypass of the AT₆-ACP₆ catalytic domains in the truncated PikAIV, while retaining thioesterase activity. Evidently, the main function of truncated PikAIV is to serve as a scaffold that orients the TE domain and stabilizes the interacting complex between PikAIII and PikAIV, thereby greatly increasing the production of 10-deoxymethynolide.
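The relationship between the number of extension modules used and the ring size of the released macrolactone can be illustrated with a short calculation. This is a minimal sketch under stated assumptions, not a description of the enzymology: it assumes one starter unit plus one two-carbon backbone unit per extension module, and lactonization through the hydroxyl on the penultimate backbone carbon. The function name is hypothetical.

```python
def macrolactone_ring_size(extension_modules_used: int) -> int:
    """Ring size of the macrolactone released after a given number of
    extension modules.

    Assumptions (illustrative only):
      * one starter unit plus one ketide unit per extension module,
        each contributing two backbone carbons;
      * ring closure through the hydroxyl on the penultimate backbone
        carbon, so the ring holds (2n - 1) carbons plus one oxygen.
    """
    if extension_modules_used < 1:
        raise ValueError("at least one extension module is required")
    ketide_units = extension_modules_used + 1   # starter + extensions
    backbone_carbons = 2 * ketide_units
    ring_carbons = backbone_carbons - 1         # C-1 through penultimate carbon
    return ring_carbons + 1                     # plus the lactone oxygen


if __name__ == "__main__":
    # Termination after module 5 (KS6 bypassed) versus after module 6.
    print(macrolactone_ring_size(5), macrolactone_ring_size(6))
```

Under these assumptions, termination after module 5 corresponds to the 12-membered ring and termination after module 6 to the 14-membered ring, matching the two products discussed above.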

[0300] Efficient production of 10-deoxymethynolide by a truncated form of PikAIV suggests that the AT domain, rather than the KS domain, plays a pivotal role in the structure and function of modular PKS. The KS₆-truncated form of PikAIV generated from the pDHS708 (AT₆-ACP₆-TE) complementation plasmid probably forms a heterodimer with the product of the corresponding AX912 chromosomal allele to generate narbonolide (FIG. 42E), and it also efficiently forms a homodimer to produce 10-deoxymethynolide (FIG. 42F). However, this dimerization capacity was severely limited when the AT₆ domain was truncated in pDHS705 (ACP₆-TE). Furthermore, the complete absence of complementation by pDHS704 (TE alone) to give narbonolide (under culture conditions for pikromycin production) suggests that a dominant interaction exists between KS₆ and PikAIII (FIG. 42D), which may be the primary basis of module-module recognition and docking in multifunctional PKS systems. The pikA system in S. venezuelae provides a unique opportunity as well as a powerful tool to study these fundamental interactions in further detail.

[0301] It is valuable to compare alternative termination by differential expression of PikAIV in S. venezuelae with engineered polyketide chain-length manipulations from other PKS systems. In the erythromycin PKS, the TE domain from EryAIII was moved to upstream domains and covalently linked to alternative ACPs, resulting in truncated polyketides (Cortes et al., 1995; Kao et al., 1995). In each case, the capacity for producing the full-length polyketide product was subsequently eliminated. In contrast, by linking the TE domain of PikAIV to an upstream module by protein-protein interactions, S. venezuelae retains the capacity to generate two alternative-sized macrolactones. Sequence analysis (Xue et al., 1998) suggested that pikA may have evolved from a six-module PKS that generated a 14-membered ring macrolactone. It is, therefore, interesting to consider that the structural and regulatory evolution of pikA to produce the rare 12-membered ring macrolactone may be the result of endogenous genetic selection to overcome antibiotic resistance within the ecological milieu of the antibiotic producing microorganism. The pikA system provides a natural example of a branched metabolic pathway with the capacity to generate multiple macrolactone systems that may be readily exploited for combinatorial biosynthetic creation of novel natural products.

Example 11

[0302] A mutant of S. venezuelae (KdesV-41) was constructed that had the desV gene disrupted (Zhao et al., J. Am. Chem. Soc., 120, 12159 (1998)). Since desV encodes the 3-aminotransferase that catalyzes the conversion of the 3-keto sugar 17 (FIG. 42) to the corresponding amino sugar 4, deletion of this gene should prevent C-3 transamination, resulting in the accumulation of 17. It was expected that if the glycosyltransferase (DesVII) of this pathway is capable of recognizing and processing the keto sugar intermediate 17, the macrolide product(s) produced by the KdesV-41 mutant should have an attached 3-keto sugar. Surprisingly, the two products isolated were the methymycin/neomethymycin analogues 18 and 19, each carrying a 4,6-dideoxyhexose (FIG. 43). While this result demonstrated a relaxed specificity for the glycosyltransferase toward its sugar substrate, it also indicated the existence of a pathway-independent reductase in S. venezuelae that can stereospecifically reduce the C-3 keto group of the sugar metabolite.

[0303] To explore the possibility of generating a mutant capable of synthesizing new macrolides of this class containing an engineered sugar, the desI gene, which has been proposed to encode the dehydrase responsible for the C-4 deoxygenation in the biosynthesis of desosamine, was altered with the prediction that it would lead to the incorporation of D-quinovose (22; FIG. 44), also known as 6-deoxy-D-glucose, into the final product(s). The rationale was based on the following: (1) Desosamine biosynthesis will be “terminated” at the C-4 deoxygenation step due to desI deletion and, thus, should result in the accumulation of 3-keto-6-deoxyhexose 16 (FIG. 42). (2) By taking advantage of the existence of a 3-ketohexose reductase in S. venezuelae, the sugar intermediate 15 is expected to be reduced stereospecifically to D-quinovose (22). (3) The glycosyltransferase (DesVII), with its relaxed specificity toward the sugar substrate, should catalyze the coupling of 22 to the macrolactones to give new macrolides 20 and 21 containing the engineered sugar D-quinovose (FIG. 44).

[0304] A disruption plasmid, pDesI-K, derived from pKC1139, which contains an apramycin resistance marker, was constructed in which desI was replaced by the neomycin resistance gene, which also confers resistance to kanamycin. This construct was then introduced into wild type S. venezuelae by conjugal transfer using Escherichia coli S17-1 as the donor strain (Bierman et al., 1992). Several double crossover mutants were identified on the basis of their kanamycin-resistant (Kan^(R)) and apramycin-sensitive (Apr^(S)) phenotypes. One mutant, KdesI-80, was selected and grown at 29° C. in seed medium (100 mL) for 48 hours and then inoculated and grown in vegetative medium (5 L) for another 48 hours (Cane et al., 1993). The fermentation broth was centrifuged to remove cellular debris and mycelia, and the supernatant was adjusted to pH 9.5 with concentrated potassium hydroxide solution. The resulting solution was extracted with chloroform, and the pooled organic extracts were dried over sodium sulfate and evaporated to dryness. The yellow oil was subjected to flash chromatography on silica gel using a gradient of 0-12% methanol in chloroform, and the isolated products were further purified by HPLC using a C₁₈ column eluted isocratically with 50% acetonitrile in water. As expected, no methymycin or neomethymycin was detected; instead, 10-deoxymethynolide 23 was found as the major product (approximately 600 mg). Significant quantities of methynolide 24 (approximately 40 mg) and neomethynolide 25 (approximately 2 mg) were also isolated (FIG. 45). A new macrolide 15 containing D-quinovose (3.2 mg) was produced by this mutant. Its structure was fully established by spectral analyses.
Spectral data (J values are in hertz) for 15: ¹H NMR (CDCl₃) δ 6.76 (1H, dd, J=16.0, 5.5, 9-H), 6.43 (1H, d, J=16.0, 8-H), 4.97 (1H, ddd, J=8.4, 5.9, 2.5, 11-H), 4.29 (1H, d, J=8.0, 1′-H), 3.62 (1H, d, J=10.5, 3-H), 3.49 (1H, t, J=9.0, 3′-H), 3.36 (1H, dd, J=9.0, 8.0, 2′-H), 3.32 (1H, dq, J=8.5, 5.5, 5′-H), 3.23 (1H, dd, J=9.0, 8.5, 4′-H), 2.82 (1H, dq, J=10.5, 7.0, 2-H), 2.64 (1H, m, 10-H), 2.55 (1H, m, 6-H), 1.70 (1H, m, 12a-H), 1.66 (1H, bt, J=12.5, 5b-H), 1.56 (1H, m, 12b-H), 1.40 (1H, dd, J=12.5, 4.5, 5a-H), 1.35 (3H, d, J=7.0, 2-Me), 1.31 (3H, d, J=5.5, 5′-Me), 1.24 (1H, bdd, J=10.0, 4.5, 4-H), 1.21 (3H, d, J=7.0, 6-Me), 1.11 (3H, d, J=6.5, 10-Me), 1.00 (3H, d, J=7.0, 4-Me), 0.92 (3H, t, J=7.5, 12-Me); ¹³C NMR (CDCl₃) δ 205.0 (C-7), 174.7 (C-1), 146.9 (C-9), 125.9 (C-8), 102.9 (C-1′), 85.4 (C-3), 76.5 (C-3′), 75.5 (C-4′), 74.7 (C-2′), 73.9 (C-11), 71.6 (C-5′), 45.0 (C-6), 43.9 (C-2), 37.9 (C-10), 34.1 (C-5), 33.4 (C-4), 25.2 (C-12), 17.7 (6-Me), 17.5 (5′-Me), 17.4 (4-Me), 16.2 (2-Me), 10.3 (12-Me), 9.6 (10-Me); high-resolution FAB-MS calculated for C₂₃H₃₈O₈ (M+H)⁺ 443.2644, found 443.2661.
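The high-resolution mass values reported above can be checked with a short calculation. The sketch below is illustrative only: the monoisotopic atomic masses and the electron mass are standard reference values supplied here as assumptions (they are not given in the text), and the function names are hypothetical. Small differences in the last decimal place relative to the reported calculated value may arise from the mass table used or from how the electron mass is treated.

```python
import re

# Monoisotopic atomic masses in unified atomic mass units (assumed
# standard reference values; not taken from the text).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307400, "O": 15.99491462}
ELECTRON_MASS = 0.00054858


def monoisotopic_mass(formula: str) -> float:
    """Sum monoisotopic masses for a simple formula such as 'C23H38O8'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def mz_protonated(formula: str) -> float:
    """m/z of the singly charged (M+H)+ ion: neutral mass plus one proton."""
    return monoisotopic_mass(formula) + MONOISOTOPIC["H"] - ELECTRON_MASS


def ppm_error(found: float, calculated: float) -> float:
    """Signed mass error in parts per million."""
    return (found - calculated) / calculated * 1e6


if __name__ == "__main__":
    print(f"(M+H)+ for C23H38O8: {mz_protonated('C23H38O8'):.4f}")
    print(f"error of found vs. reported calculated value: "
          f"{ppm_error(443.2661, 443.2644):.1f} ppm")
```

The second line of output compares the two values stated in the text (found 443.2661 against calculated 443.2644), so it does not depend on the assumed mass table.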

[0305] The fact that macrolide 15 containing D-quinovose is indeed produced by the desI mutant is significant. First, the formation of quinovose as predicted further corroborates the presence of a pathway-independent reductase in S. venezuelae that reduces the 3-keto sugars. Interestingly, this reductase is able to act on the 4,6-dideoxy sugar 17 as well as the 6-deoxy sugar 16, suggesting that it is oblivious to the presence of a hydroxyl group at C-4. However, it is not clear at this point whether the reduction occurs on the free sugar or after it is appended to the aglycone. Second, the retention of the 4-OH in quinovose as a result of desI deletion provides strong evidence supporting the assigned role of desI as encoding a C-4 dehydrase. Moreover, the results again show that the glycosyltransferase (DesVII) of this pathway can recognize alternative sugar substrates whose structures are considerably different from the original amino sugar substrate desosamine. While the incorporation of quinovose is important, another noteworthy, albeit unexpected, result was the fact that the aglycone of the isolated macrolide 15 was 10-deoxymethynolide 23 instead of methynolide 24 and neomethynolide 25. It is possible that the cytochrome P450 hydroxylase (PikC), which catalyzes the hydroxylation of 10-deoxymethynolide at either its C-10 or C-12 position (Xue et al., Chem. Biol., 5, 661 (1998)), is sensitive to structural variations in the appended sugar. It could be argued that the presence of the 4-OH group in the sugar moiety is somehow responsible for decreasing or preventing hydroxylation of the macrolide.

[0306] Thus, the results demonstrate the feasibility of combining pathway-dependent genetic manipulations and pathway-independent enzymatic reactions to engineer a sugar of designed structure. It is conceivable that the pathway-independent enzymes could also be used in concert with the natural biosynthetic machinery to generate further structural diversity, which can provide an array of random compounds.

REFERENCES

[0307] Andersen, J. R., Hutchinson, C. R. J. Bacteriol., 174:725-735 (1992).

[0308] Aparicio, J. F., Molnar, I., Schwecke, T., Konig, A., Haydock, S. F., Khaw, L. E., Staunton, J., Leadlay, P. F. Gene, 169:9-16 (1996).

[0309] Arisawa, A., Kawamura, N., Takeda, K., Tsunekawa, H., Okamura, K., Okamoto, R. Appl. Environ. Microbiol., 60:2657-2660 (1994).

[0310] August, P. R., Tang, L., Yoon, Y. J., Ning, S., Muller, R., Yu, T. W., Taylor, M., Hoffmann, D., Kim, C. G., Zhang, X., Hutchinson, C. R. & Floss, H. G. Chem. Biol., 5:69-79 (1998).

[0311] Baltz, R. H., Seno, E. T. Annu. Rev. Microbiol., 42:547-574 (1988).

[0312] Bibb, M. J., Bibb, M. J., Ward, J. M., Cohen, S. N. Mol. Gen. Genet., 199:26-36 (1985).

[0313] Bierman, M., Logan, R., O'Brien, K., Seno, G., Nagaraja, R., Schoner, B. E. Gene, 116:43-49 (1992).

[0314] Box, R. P. Clin. Infect. Dis., 24:S151 (1997).

[0315] Cane, D. E., Lambalot, R. H., Prabhakaran, P. C., Ott, W. R. J. Am. Chem. Soc., 115:522-526 (1993).

[0316] Carreras, C. W., Pieper, R., Khosla, C. In Bioorganic Chemistry Deoxysugars, Polyketides & Related Classes: Synthesis, Biosynthesis, Enzymes, Rohr, J. (ed.), Springer:Berlin, 85-126 (1997).

[0317] Castle, L. A., Smith, K. D., Morris, R. O. J. Bacteriol., 174:1478-1486 (1992).

[0318] Celmer, W. D., Nagel, A. A., Wadlow, J. W., Tatematsu, H., Ikenaga, S., Nakanishi, S. Abstracts of Papers of 24th Intersci. Conf. on Antimicrob. Agents Chemother., No. 1142, Washington, D.C. (1985).

[0319] Cortes, J., Haydock, S. F., Roberts, G. A., Bevitt, D. J., Leadlay, P. F. Nature, 348:176-8 (1990).

[0320] Cortes, J., Wiesmann, K. E., Roberts, G. A., Brown, M. J., Staunton, J., Leadlay, P. F. Science, 268:1487-9 (1995).

[0321] Cundliffe, E. C. Annu. Rev. Microbiol., 43:207-233 (1989).

[0322] Cundliffe, E. Antimicrob. Agents Chemother., 36:348-352 (1992).

[0323] Davies, J. Nature, 383:219-220 (1996).

[0324] Denis, F., Brzezinski, R. Gene, 111:115-118 (1992).

[0325] Djerassi, C., Zderic, J. A. J. Am. Chem. Soc., 78:6390-6395 (1956).

[0326] Donadio, S., McAlpine, J. B., Sheldon, P. J., Jackson, M., Katz, L. Proc. Natl. Acad. Sci. U.S.A., 90:7119-7123 (1993).

[0327] Donadio, S., Staver, M. J., McAlpine, J. B., Swanson, S. J., Katz, L. Science, 252:675-9 (1991).

[0328] Donadio, S., Katz, L. Gene, 111:51-60 (1992).

[0329] Donin, M. N., Pagano, J., Dutcher, J. D., McKee, C. M. Antibiotics Annu., 1:179-185 (1953-1954).

[0330] Epp, J., Huber, M. L. B., Turner, J. R., Goodson, T., Schoner, B. E. Gene, 85:293-301 (1989).

[0331] Flinn, E. H., Sigal, M. V., Jr., Wiley, P. F., Gerzon, K. J. Am. Chem. Soc., 76:3121-3131 (1954).

[0332] Gaisser, S., Bohm, G. A., Cortes, J., Leadlay, P. F. Mol. Gen. Genet., 256:239-251 (1997).

[0333] Gandecha, A. R., Large, S. L., Cundliffe, E. Gene, 184:197-203 (1997).

[0334] Geistlich, M., Losick, R., Turner, J. R., Rao, R. N. Mol. Microbiol., 6:2019-2029 (1992).

[0335] Gokhale, R. S., Hunziker, D., Cane, D. E., Khosla, C. Chem. Biol., 6:117-125 (1999).

[0336] Haydock, S. F., Dowson, J. A., Dhillon, N., Roberts, G. A., Cortes, J., Leadlay, P. F. Mol. Gen. Genet., 230:120-128 (1991).

[0337] Hernandez, C., Olano, C., Mendez, C., Salas, J. A. Gene, 134:139-140 (1993).

[0338] Hopwood, D. A., Sherman, D. H. Annu. Rev. Genet., 24:37-66 (1990).

[0339] Hopwood, D. A., Malpartida, F., Kieser, H. M., Ikeda, H., Duncan, J., Fujii, I., Rudd, B. A., Floss, H. G., Omura, S. Nature, 314:642-644 (1985).

[0340] Hopwood, D. A., Bibb, M. J., Chater, K. J., Kieser, T., Bruton, C. J., Kieser, H. M., Lydiate, D. J., Smith, C. P., Ward, J. M., Schrempf, H., Genetic Manipulation of Streptomyces: A Laboratory Manual (The John Innes Foundation) (1985).

[0341] Hori et al., Chem. Comm., 304 (1971).

[0342] Hutchinson, C. R., Fujii, I. Annu. Rev. Microbiol., 49:201-238 (1995).

[0343] Ingrosso, D., Fowler, A. V., Bleibaum, J., Clarke, S. J. Biol. Chem., 264:20130-20139 (1989).

[0344] Jacobsen, J. R., Hutchinson, C. R., Cane, D. E., Khosla, C. Science, 277:367-369 (1997).

[0345] Jenkins, G., Cundliffe, E. Gene, 108:55-62 (1991).

[0346] Kakavas, S. J., Katz, L., Stassi, D. J. Bacteriol., 179:7515-22 (1997).

[0347] Kao, C. M., Luo, G. L., Katz, L., Cane, D. E., Khosla, C. J. Am. Chem. Soc., 117:9105-9106 (1995).

[0348] Katz, L., Donadio, S. Annu. Rev. Microbiol., 47:875-912 (1993).

[0349] Katz, L., Chem. Rev., 97:2557-2575 (1997).

[0350] Khosla, C., Chem. Rev., 97:2577-2590 (1997).

[0351] Khosla, C., Zawada, R. J. Trends Biotechnol., 14:335-341 (1996).

[0352] Kirschning, A., Bechthold, A. F.-W., Rohr, J. In Bioorganic Chemistry Deoxysugars, Polyketides & Related Classes: Synthesis, Biosynthesis, Enzymes, Rohr, J. (ed.), Springer:Berlin, 1-84 (1997).

[0353] Kramer, P. J., Khosla, C. Annu. N.Y. Acad. Sci., 799:32-45 (1996).

[0354] Kuo, M.-S., Chirby, D. G., Argoudelis, A. D., Cialdella, J. I., Coats, J. H., Marshall, V. P. Antimicrob. Agents Chemother., 33:2089-2091 (1989).

[0355] Lambalot, R. H., Cane, D. E. J. Antibiot., 45:1981-1982 (1992).

[0356] Lin, E. C. C., Goldstein, R., Syvanen, M. Bacteria, Plasmids, and Phages, An Introduction to Molecular Biology, Harvard University Press:Cambridge, p. 123 (1984).

[0357] Liu, H.-w., Thorson, J. S. Annu. Rev. Microbiol., 48:223-256 (1994).

[0358] Lydiate, D. J., Malpartida, F., Hopwood, D. A. Gene, 35:223-235 (1985).

[0359] Madduri, K., Kennedy, J., Rivola, G., Inventi-Solari, A., Filippini, S., Zanuso, G., Colombo, A. L., Gewain, K. M., Occi, J. L., MacNeil, D. J., Hutchinson, C. R. Nature Biotech., 16:69-74 (1998).

[0360] Mangahas, F. R. MS Thesis, University of Minnesota, 1996.

[0361] Marahiel, M. A., Stachelhaus, T., Mootz, H. D., Chem. Rev., 97:2651-2673 (1997).

[0362] Marsden, A. F. A., Wilkinson, B., Cortes, J., Dunster, N. J., Staunton, J., Leadlay, P. F. Science, 279:199-201 (1998).

[0363] Merson-Davies, L. A., Cundliffe, E. Mol. Microbiol., 13:349-355 (1994).

[0364] Merson-Davies, L. A., Cundliffe, E. Mol. Microbiol., 13:347-355 (1994).

[0365] Motamedi, H., Cai, S. J., Shafiee, A., Elliston, K. O. Eur. J. Biochem., 244:74-80 (1997).

[0366] Muth, G., Nussbaumer, B., Wohlleben, W., Puhler, A. Mol. Gen. Genet., 219:341-348 (1989).

[0367] Niemi, J., Mantsala, P. J. Bacteriol., 177:2942-2945 (1995).

[0368] Omura, S. (ed.) Macrolide Antibiotics, Chemistry, Biology, and Practice, Academic Press:New York (1984).

[0369] Omuras et al., J. Antibio., 29, 316 (1971).

[0370] Rangaswamy, V., Mitchell, R., Ullrich, M., Bender, C. J. Bacteriol., 180:3330-3338 (1998).

[0371] Sambrook, J., Fritsch, E. F., Maniatis, T. Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press), 2nd edition (1989).

[0372] Sasaki, J., Mizoue, K., Morimoto, S., Omura, S. J. Antibiotics, 49:1110-1118 (1996).

[0373] Schneider, A., Marahiel, M. A., Arch. Microbiol., 169:404-410 (1998).

[0374] Schupp, T., Toupet, C., Cluzel, B., Neff, S., Hill, S., Beck, J. J., Ligon, J. M., J. Bacteriol., 177:3673-9 (1995).

[0375] Schwecke, T., Aparicio, J. F., Molnar, I., Konig, A., Khaw, L. E., Haydock, S. F., Oliynyk, M., Caffrey, P., Cortes, J., Lester, J. B., et al. Proc. Natl. Acad. Sci. U.S.A., 92:7839-7843 (1995).

[0376] Seo, S., Tomita, Y., Tori, K., Yoshimura, Y. J. Am. Chem. Soc., 100:3331-3339 (1978).

[0377] Service, R. F. Science, 270:724-727 (1995).

[0378] Stassi, D., Donadio, S., Staver, M. J., Katz, L. J. Bacteriol., 175:182-189 (1993).

[0379] Staunton, J., Caffrey, P., Aparicio, J. F., Roberts, G. A., Bethell, S. S., Leadlay, P. F. Nat. Struct. Biol., 3:188-192 (1996).

[0380] Staunton, J., Wilkinson, B., Chem. Rev., 97:2611-2629 (1997).

[0381] Summers, R. G., Donadio, S., Staver, M. J., Wendt-Pienkowski, E., Hutchinson, C. R., Katz, L. Microbiology, 143:3251-3262 (1997).

[0382] Swan, D. G., Rodriguez, A. M., Vilches, C., Mendez, C., Salas, J. A. Mol. Gen. Genet., 242:358-362 (1994).

[0383] Tuan, J. S., Weber, J. M., Staver, M. J., Leung, J. O., Donadio, S., Katz, L. Gene, 90:21-29 (1990).

[0384] Vilches, C., Hernandez, C., Mendez, C., Salas, J. A. J. Bacteriol., 174:161-165 (1992).

[0385] von Heijne, G. Nucleic Acids Res., 14:4683-4690 (1986).

[0386] von Heijne, G., Abrahmsen, L. FEBS Lett., 244:439-446 (1989).

[0387] Weber, J. M., Leung, J. O., Swanson, S. J., Idler, K. B., McAlpine, J. B. Science, 252:114-117 (1991).

[0388] Xue, Y., Zhao, L., Liu, H.-w., Sherman, D. H. Proc. Natl. Acad. Sci. U.S.A., 95:12111-12116 (1998).

[0389] The complete disclosures of all patents, patent documents and publications cited herein are incorporated herein by reference as if individually incorporated. The foregoing detailed description and examples have been given for clarity of understanding only. No unnecessary limitations are to be understood therefrom. The invention is not limited to the exact details shown and described, for variations obvious to one skilled in the art will be included within the invention defined by the claims.

1 53 1 15872 DNA Streptomyces venezuelae 1 ttaattaagg aggaccatcatgaacgaggc catcgccgtc gtcggcatgt cctgccgcct 60 gccgaaggcc tcgaacccggccgccttctg ggagctgctg cggaacgggg agagcgccgt 120 caccgacgtg ccctccggccggtggacgtc ggtgctcggg ggagcggacg ccgaggagcc 180 ggcggagtcc ggtgtccgccggggcggctt cctcgactcc ctcgacctct tcgacgcggc 240 cttcttcgga atctcgccccgtgaggccgc cgccatggac ccgcagcagc gactggtcct 300 cgaactcgcc tgggaggcgctggaggacgc cggaatcgtc cccggcaccc tcgccggaag 360 ccgcaccgcc gtcttcgtcggcaccctgcg ggacgactac acgagcctcc tctaccagca 420 cggcgagcag gccatcacccagcacaccat ggcgggcgtg aaccggggcg tcatcgccaa 480 ccgcgtctcg taccacctcggcctgcaggg cccgagcctc accgtcgacg ccgcgcagtc 540 gtcctcgctc gtcgccgtgcacctggcctg cgagtccctg cgcgccgggg agtccacgac 600 ggcgctcgtc gccggcgtgaacctcaacat cctcgcggag agcgccgtga cggaggagcg 660 cttcggtgga ctctccccggacggcaccgc ctacaccttc gacgcgcggg ccaacggatt 720 cgtccggggc gagggcggcggagtcgtcgt actcaagccg ctctcccgcg ccctcgccga 780 cggcgaccgt gtccacggcgtcatccgcgc cagcgccgtc aacaacgacg gagccacccc 840 gggtctcacc gtgcccagcagggccgccca ggagaaggtg ctgcgcgagg cgtaccggaa 900 ggcggccctg gacccgtccgccgtccagta cgtcgaactc cacggcaccg gaacccccgt 960 cggcgacccc atcgaggccgccgcgctcgg cgccgtcctc ggctcggcgc gccccgcgga 1020 cgaacccctg ctcgtcggctcggccaagac gaacgtcggg cacctcgaag gcgccgccgg 1080 catcgtcggc ctcatcaagacgctcctcgc gctcggccgg cgccggatcc cggcgagcct 1140 caacttccgt acgccccacccggacatccc gctcgacacc ctcgggctcg acgtgcccga 1200 cggcctgcgg gagtggccgcacccggaccg cgaactcctc gccggcgtca gctcgttcgg 1260 catgggcggc accaacgcccacgtcgtcct cagcgaaggc cccgcccagg gcggcgagca 1320 gcccggcatc gatgaggagacccccgtcga cagcggggcc gcactgccct tcgtcgtcac 1380 cggccgcggc ggcgaggccctgcgcgccca ggcccggcgc ctgcacgagg ccgtcgaagc 1440 ggacccggag ctcgcgcccgccgcactcgc ccggtcgctg gtcaccaccc gtacggtctt 1500 cacgcaccgg tcggtcgtcctcgccccgga ccgcgcccgc ctcctcgacg gcctcggcgc 1560 cctcgccgcc gggacgcccgcgcccggcgt ggtcaccggc acccccgccc ccgggcgcct 1620 cgccgtcctg ttcagcggccagggtgccca acgtacgggc atgggcatgg agttgtacgc 1680 cgcccacccc 
gccttcgcgacggccttcga cgccgtcgcc gccgaactgg accccctcct 1740 cgaccggccc ctcgccgaactcgtcgcggc gggcgacacc ctcgaccgca ccgtccacac 1800 acagcccgcg ctcttcgccgtggaggtcgc cctccaccgc ctcgtcgagt cctggggcgt 1860 cacgcccgac ctgctcgccggccactccgt cggcgagatc agcgccgccc acgtcgccgg 1920 ggtcctgtcg ctgcgcgacgccgcccgcct cgtcgcggcg cgcggccgcc tcatgcaggc 1980 gctccccgag ggcggcgcgatggtcgcggt cgaggcgagc gaggaggaag tgcttccgca 2040 cctcgcggga cgcgagcgggagctctccct cgcggccgtg aacggccccc gcgcggtcgt 2100 cctcgcgggc gccgagcgcgccgtcctcga cgtcgccgag ctgctgcgcg aacagggccg 2160 ccggacgaag cggctcagcgtctcgcacgc cttccactcg ccgctcatgg agccgatgct 2220 cgacgacttc cgccgggtcgtcgaagagct ggacttccag gagccccgcg tcgacgtcgt 2280 gtccacggtg acgggcctgcctgtcacagc gggccaatgg accgatcccg agtactgggt 2340 ggaccaggtc cgcaggcccgtacgcttcct cgacgccgta cgcaccctgg aggaatcggg 2400 cgccgacacc ttcctggagctcggtcccga cggggtctgc tccgcgatgg cggcggactc 2460 cgtacgcgac caggaggccgccacggcggt ctccgccctg cgcaagggcc gcccggagcc 2520 ccagtcgctg ctcgccgcactcaccaccgt cttcgtccgg ggccacgacg tcgactggac 2580 cgccgcgcac gggagcaccggcacggtcag ggtgcccctg ccgacctacg ccttccagcg 2640 cgaacgccac tggttcgacggcgccgcgcg aacggcggcg ccgctcacgg cgggccgatc 2700 gggcaccggt gcgggcaccggcccggccgc gggtgtgacg tcgggcgagg gcgagggcga 2760 gggcgagggc gcgggtgcgggtggcggtga tcggccggct cgccacgaga cgaccgagcg 2820 cgtgcgcgca cacgtcgccgccgtcctcga gtacgacgac ccgacccgcg tcgaactcgg 2880 cctcaccttc aaggagctgggcttcgactc cctcatgtcc gtcgagctgc ggaacgcgct 2940 cgtcgacgac acgggactgcgcctgcccag cggactgctc ttcgaccacc cgacgccgcg 3000 cgccctcgcc gcccacctgggcgacctgct caccggcggc agcggcgaga ccggatcggc 3060 cgacgggata ccgcccgcgaccccggcgga caccaccgcc gagcccatcg cgatcatcgg 3120 catggcctgc cgctaccccggcggcgtcac ctcccccgag gacctgtggc ggctcgtcgc 3180 cgaggggcgc gacgccgtctcggggctgcc caccgaccgc ggctgggacg aggacctctt 3240 cgacgccgac cccgaccgcagcggcaagag ctcggtccgc gagggcggat tcctgcacga 3300 cgccgccctg ttcgacgccggcttcttcgg gatatcgccc cgcgaggccc tcggcatgga 3360 cccgcagcag cggctgctcctggagacggc atgggaggcc 
gtggagcgcg cagggctcga 3420 ccccgaaggc ctcaagggcagccggacggc cgtcttcgtc ggcgccaccg ccctggacta 3480 cggcccgcgc atgcacgacggcgccgaggg cgtcgagggc cacctcctga ccgggaccac 3540 gcccagcgtg atgtcgggccgcatcgccta ccagctcggc ctcaccggtc ctgcggtcac 3600 cgtcgacacg gcctgctcgtcctcgctcgt cgcgctgcac ctggccgtcc gttcgctgcg 3660 gcagggcgag tcgagcctcgcgctcgccgg cggagcgacc gtcatgtcga caccgggcat 3720 gttcgtcgag ttctcgcggcagcgcggcct cgccgccgac ggccgctcca aggccttctc 3780 cgactccgcc gacggcacctcctgggccga gggcgtcggc ctcctcgtcg tcgagcggct 3840 ctcggacgcc gagcgcaacggccaccccgt gctcgccgtg atccggggca gcgcggtcaa 3900 ccaggacggc gcctccaacgggctcaccgc ccccaacggc ccgtcccagc agcgcgtcat 3960 ccgacaggcc ctggccgacgccgggctcac cccggccgac gtcgacgccg tcgaggcgca 4020 cggtacgggt acccggctcggcgaccccat cgaggccgag gcgatcctcg gcacctacgg 4080 ccgggaccgg ggcgagggcgctccgctcca gctcggctcg ctgaagtcga acatcggcca 4140 cgcgcaggcc gccgcgggcgtgggcgggct catcaagatg gtcctcgcga tgcgccacgg 4200 cgtcctgccc aggacgctccacgtggaccg gcccaccacc cgcgtcgact gggaggccgg 4260 cggcgtcgag ctcctcaccgaggagcggga gtggccggag acgggccgcc cgcgccgcgc 4320 ggcgatctcc tccttcggcatcagcggcac caacgcccac atcgtggtcg aacaggcccc 4380 ggaagccggg gaggcggcggtcaccaccac cgccccggaa gcaggggaag ccggggaagc 4440 ggcggacacc accgccaccacgacgccggc cgcggtcggc gtccccgaac ccgtacgcgc 4500 ccccgtcgtg gtctccgcgcgggacgccgc cgccctgcgc gcccaggccg ttcggctgcg 4560 gaccttcctc gacggccgaccggacgtcac cgtcgccgac ctcggacgct cgctggccgc 4620 ccgtaccgcc ttcgagcacaaggccgccct caccaccgcc accagggacg agctgctcgc 4680 cgggctcgac gccctcggccgcggggagca agccacgggc ctggtcaccg gcgaaccggc 4740 cagggccgga cgcacggccttcctgttcac cggccaggga gcgcagcgcg tcgccatggg 4800 cgaggaactg cgcgccgcgcaccccgtgtt cgccgccgcc ctcgacaccg tgtacgcggc 4860 cctcgaccgt cacctcgaccggccgctgcg ggagatcgtc gccgccgggg aggagctgga 4920 cctcaccgcg tacacccagcccgccctctt cgccttcgag gtggcgctgt tccgcctcct 4980 cgaacaccac ggcctcgtccccgacctgct caccggccac tccgtcggcg agatcgccgc 5040 cgcgcacgtc gccggtgtcctctccctcga cgacgccgca cgtctcgtca ccgcccgcgg 5100 ccggctcatg 
cagtcggcccgcgagggcgg cgcgatgatc gccgtgcagg cgggcgaggc 5160 cgaggtcgtc gagtccctgaagggctacga gggcagggtc gccgtcgccg ccgtcaacgg 5220 acccaccgcc gtggtcgtctccggcgacgc ggacgccgcc gaggagatcc gcgccgtatg 5280 ggcgggacgc ggccggcgcacccgcaggct gcgcgtcagc cacgccttcc actccccgca 5340 catggacgac gtcctcgacgagttcctccg ggtcgccgag ggcctgacct tcgaggagcc 5400 gcggatcccc gtcgtctccacggtcaccgg cgcgctcgtc acgtccggcg agctcacctc 5460 gcccgcgtac tgggtcgaccagatccggcg gcccgtgcgc ttcctggacg ccgtccgcac 5520 cctggccgcc caggacgcgaccgtcctcgt cgagatcggc cccgacgccg tcctcacggc 5580 actcgccgag gaggctctcgcgcccggcac ggacgccccg gacgcccggg acgtcacggt 5640 cgtcccgctg ctgcgcgcggggcgccccga gcccgagacc ctcgccgccg gtctcgcgac 5700 cgcccatgtc cacggcgcacccttggaccg ggcgtcgttc ttcccggacg ggcgccgcac 5760 ggacctgccc acgtacgccttccggcgcga gcactactgg ctgacgcccg aggcccgtac 5820 ggacgcccgc gcactcggcttcgacccggc gcggcacccg ctgctgacga ccacggtcga 5880 ggtcgccggc ggcgacggcgtcctgctgac cggccgtctc tccctgaccg accagccctg 5940 gctggccgac cacatggtcaacggcgccgt cctgttgccg gccaccgcct tcctggagct 6000 cgccctcgcg gcgggcgaccacgtcggggc ggtccgggtg gaggaactca ccctcgaagc 6060 gccgctcgtc ctgcccgagcggggcgccgt ccgcatccag gtcggcgtga gcggcgacgg 6120 cgagtcgccg gccgggcgcaccttcggtgt gtacagcacc cccgactccg gcgacaccgg 6180 tgacgacgcg ccccgggagtggacccgcca tgtctccggc gtactcggcg aaggggaccc 6240 ggccacggag tcggaccaccccggcaccga cggggacggt tcagcggcct ggccgcctgc 6300 ggcggcgacc gccacacccctcgacggcgt ctacgaccgg ctcgcggagc tcggctacgg 6360 atacggtccg gccttccagggcctgacggg gctgtggcgc gacggcgccg acacgctcgc 6420 cgagatccgg ctgcccgcggcgcagcacga gagcgcgggg ctcttcggcg tacacccggc 6480 gctgctcgac gcggcgctccacccgatcgt cctggagggc aactcagctg ccggtgcctg 6540 tgacgccgat accgacgcgaccgaccggat ccggctgccg ttcgcgtggg cgggggtgac 6600 cctccacgcc gaaggggccaccgcgctccg cgtacggatc acacccaccg gcccggacac 6660 ggtcacgctc cgcctcaccgacaccaccgg tgcgcccgtg gccaccgtgg agtccctgac 6720 cctgcgcgcg gtggcgaaggaccggctggg caccaccgcc gggcgcgtcg acgacgccct 6780 gttcacggtc gtgtggacggagaccggcac accggaaccc 
gcagggcgcg gagccgtgga 6840 ggtcgaggaa ctcgtcgacctcgccggcct cggcgacctc gtggagctcg gcgccgcgga 6900 cgtcgtcctc cgggccgaccgctggacgct cgacggggac ccgtccgccg ccgcgcgcac 6960 agccgtccgg cgcaccctcgccatcgtcca ggagttcctg tccgagccgc gcttcgacgg 7020 ctcgcgactg gtgtgcgtcaccaggggcgc ggtcgccgca ctccccggcg aggacgtcac 7080 ctccctcgcc accggccccctctggggcct cgtccgctcc gcccagtccg agaacccggg 7140 acgcctgttc ctcctggacctgggtgaagg cgaaggcgag cgcgacggag ccgaggagct 7200 gatccgcgcg gccacggccggggacgagcc gcagctcgcg gcacgggacg gccgactgct 7260 cgcgccgagg ctggcccgtaccgccgccct ttcgagtgag gacaccgccg gcggcgccga 7320 ccgtttcggc cccgacggcaccgtcctcgt caccgggggc accggaggcc tcggagcgct 7380 cctcgcccgc cacctcgtggagcgtcacgg ggtgcgccgg ctgctgctgg tgagccgccg 7440 cggggccgac gcccccggcgcggccgacct gggcgaggac ctcgcgggcc tcggcgcgga 7500 ggtggcgttc gccgccgccgacgccgccga ccgcgagagc ctggcgcggg cgatcgccac 7560 cgtgcccgcc gagcatccgctgacggccgt cgtgcacacg gcgggagtcg tcgacgacgc 7620 gacggtggag gcgctcacaccggaacggct ggacgcggta ctgcgcccga aggtcgacgc 7680 cgcgtggaac ctgcacgagctcaccaagga cctgcggctc gacgccttcg tcctcttctc 7740 ctccgtctcc ggcatcgtcggcaccgccgg ccaggccaac tacgcggcgg ccaacacggg 7800 cctcgacgcc ctcgccgcccaccgcgccgc cacgggcctg gccgccacgt cgctggcctg 7860 gggcctctgg gacggcacgcacggcatggg cggcacgctc ggcgccgccg acctcgcccg 7920 ctggagccgg gccggaatcaccccgctcac cccgctgcag ggcctcgcgc tcttcgacgc 7980 cgcggtcgcc agggacgacgccctcctcgt acccgccggg ctccgtccca ccgcccaccg 8040 gggcacggac ggacagcctcctgcgctgtg gcgcggcctc gtccgggcgc gcccgcgccg 8100 tgccgcgcgg acggccgccgaggcggcgga cacgaccggc ggctggctga gcgggctcgc 8160 cgcacagtcc cccgaggagcggcgcagcac agccgtcacg ctcgtgacgg gtgtcgtcgc 8220 ggacgtcctc gggcacgccgactccgccgc ggtcggggcg gagcggtcct tcaaggacct 8280 cggcttcgac tccctggccggggtggagct ccgcaaccgg ctgaacgccg ccaccggcct 8340 gcggctcccc gcgaccacggtcttcgacca tccctcgccg gccgcgctcg cgtcccatct 8400 cctcgcccag gtgcccgggttgaaggaggg gacggcggcg accgcgaccg tcgtggccga 8460 gcggggcgct tccttcggtgaccgtgcgac cgacgacgat ccgatcgcga tcgtgggcat 8520 ggcatgccgc 
tatccgggtggtgtgtcgtc gccggaggac ctgtggcggc tggtggccga 8580 ggggacggac gcgatcagcgagttccccgt caaccgcggc tgggacctgg agagcctcta 8640 cgacccggat cccgagtcgaagggcaccac gtactgccgg gagggcgggt tcctggaagg 8700 cgccggtgac ttcgacgccgccttcttcgg catctcgccg cgcgaggccc tggtgatgga 8760 cccgcagcag cggctgctgctggaggtgtc ctgggaggcg ctggaacgcg cgggcatcga 8820 cccgtcctcg ctgcgcggcagccgcggtgg tgtctacgtg ggcgccgcgc acggctcgta 8880 cgcctccgat ccccggctggtgcccgaggg ctcggagggc tatctgctga ccggcagcgc 8940 cgacgcggtg atgtccggccgcatctccta cgcgctcggt ctcgaaggac cgtccatgac 9000 ggtggagacg gcctgctcctcctcgctggt ggcgctgcat ctggcggtac gggcgctgcg 9060 gcacggcgag tgcgggctcgcgctggcggg cggggtggcg gtgatggccg atccggcggc 9120 gttcgtggag ttctcccggcagaaggggct ggccgccgac ggccgctgca aggcgttctc 9180 ggccgccgcc gacggcaccggctgggccga gggcgtcggc gtgctcgtcc tggagcggct 9240 gtcggacgcg cgccgcgcggggcacacggt cctcggcctg gtcaccggca ccgcggtcaa 9300 ccaggacggt gcctccaacgggctgaccgc gcccaacggc ccagcccagc aacgcgtcat 9360 cgccgaggcg ctcgccgacgccgggctgtc cccggaggac gtggacgcgg tcgaggcgca 9420 cggcaccggc acccggctcggcgaccccat cgaggccggg gcgctgctcg ccgcctccgg 9480 acggaaccgt tccggcgaccacccgctgtg gctcggctcg ctgaagtcca acatcgggca 9540 tgcccaggcc gccgccggtgtcggcggcgt catcaagatg ctccaggcgc tgcggcacgg 9600 cttgctgccc cgcaccctccacgccgacga gccgaccccg catgccgact ggagctccgg 9660 ccgggtacgg ctgctcacctccgaggtgcc gtggcagcgg accggccggc cccggcggac 9720 cggggtgtcc gccttcggcgtcggcggcac caatgcccat gtcgtcctcg aagaggcacc 9780 cgccccgccc gcgccggaaccggccgggga ggcccccggc ggctcccgcg ccgcagaagg 9840 ggcggaaggg cccctggcctgggtggtctc cggacgcgac gagccggccc tgcggtccca 9900 ggcccggcgg ctccgcgaccacctctcccg cacccccggg gcccgcccgc gtgacatcgc 9960 cttctccctc gccgccacgcgcgcagcctt tgaccaccgc gccgtgctga tcggctcgga 10020 cggggccgaa ctcgccgccgccctggacgc gttggccgaa ggacgcgacg gtccggcggt 10080 ggtgcgcgga gtccgcgaccgggacggcag gatggccttc ctcttcaccg ggcagggcag 10140 ccagcgcgcc gggatggcccacgacctgca tgccgcccat accttcttcg cgtccgccct 10200 cgacgaggtg acggaccgtctcgacccgct gctcggccgg 
ccgctcggcg cgctgctgga 10260 cgcccgaccc ggctcgcccgaagcggcact cctggaccgg accgagtaca cccagccggc 10320 gctcttcgcc gtcgaggtggcgctccaccg gctgctggag cactggggga tgcgccccga 10380 cctgctgctg gggcactcggtgggcgaact ggcggccgcc cacgtcgcgg gtgtgctcga 10440 tctcgacgac gcctgcgcgctggtggccgc ccgcggcagg ctgatgcagc gcctgccgcc 10500 cggcggcgcg atggtctccgtgcgggccgg cgaggacgag gtccgcgcac tgctggccgg 10560 ccgcgaggac gccgtctgcgtcgccgcggt gaacggcccc cggtcggtgg tgatctccgg 10620 cgcggaggaa gcggtggccgaggcggcggc gcagctcgcc ggacgaggcc gccgcaccag 10680 gcggctccgc gtcgcgcacgccttccactc acccctgatg gacggcatgc tcgccggatt 10740 ccgggaggtc gccgccggcctgcgctaccg ggaaccggag ctgacggtcg tctccacggt 10800 cacggggcgg cccgcccgccccggtgaact caccggcccc gactactggg tggcccaggt 10860 ccgtgagccc gtgcgcttcgcggacgcggt ccgcacggca caccgcctcg gagcccgcac 10920 cttcctggag accggcccggacggcgtgct gtgcggcatg gcagaggagt gcctggagga 10980 cgacaccgtg gccctgctgccggcgatcca caagcccggc accgcgccgc acggtccggc 11040 ggctcccggc gcgctgcgggcggccgccgc cgcgtacggc cggggcgccc gggtggactg 11100 ggccgggatg cacgccgacggccccgaggg gccggcccgc cgcgtcgaac tgcccgtcca 11160 cgccttccgg caccgccgctactggctcgc cccgggccgc gcggcggaca ccgacgactg 11220 gatgtaccgg atcggctgggaccggctgcc ggctgtgacc ggcggggccc ggaccgccgg 11280 ccgctggctg gtgatccaccccgacagccc gcgctgccgg gagctgtccg gccacgccga 11340 acgcgcgctg cgcgccgcgggcgcgagccc cgtaccgctg cccgtggacg ctccggccgc 11400 cgaccgggcg tccttcgcggcactgctgcg ctccgccacc ggacctgaca cacgaggtga 11460 cacagccgcg cccgtggccggtgtgctgtc gctgctgtcc gaggaggatc ggccccatcg 11520 ccagcacgcc ccggtacccgccggggtcct ggcgacgctg tccctgatgc aggctatgga 11580 ggaggaggcg gtggaggctcgcgtgtggtg cgtctcccgc gccgcggtcg ccgccgccga 11640 ccgggaacgg cccgtcggcgcgggcgccgc cctgtggggg ctggggcggg tggccgccct 11700 ggaacgcccc acccggtggggcggtctcgt ggacctgccc gcctcgcccg gtgcggcgca 11760 ctgggcggcc gccgtggaacggctcgccgg tcccgaggac cagatcgccg tgcgcgcgtc 11820 cggcagttgg ggccggcgcctcaccaggct gccgcgcgac ggcggcggcc ggacggccgc 11880 acccgcgtac cggccgcgcggcacggtgct cgtcaccggt ggcaccggcg 
cgctcggcgg 11940 gcatctcgcc cgctggctcgccgcggcggg cgccgaacac ctggcgctca ccagccgccg 12000 gggcccggac gcgcccggcgccgccggact cgaggccgaa ctcctcctcc tgggcgccaa 12060 ggtgacgttc gccgcctgcgacaccgccga ccgcgacggc ctcgcccggg tcctgcgggc 12120 gataccggag gacaccccgctcaccgcggt gttccacgcc gcgggcgtac cgcaggtcac 12180 gccgctgtcc cgtacctcgcccgagcactt cgccgacgtg tacgcgggca aggcggcggg 12240 cgccgcgcac ctggacgaactgacccgcga actcggcgcc ggactcgacg cgttcgtcct 12300 ctactcctcc ggcgccggcgtctggggcag cgccggccag ggtgcctacg ccgccgccaa 12360 cgccgccctg gacgcgctcgcccggcgccg tgcggcggac ggactccccg ccacctccat 12420 cgcctggggc gtgtggggcggcggcggtat gggggccgac gaggcgggcg cggagtatct 12480 gggccggcgc ggtatgcgccccatggcacc ggtctccgcg ctccgggcga tggccaccgc 12540 catcgcctcc ggggaaccctgccccaccgt cacccacacc gactgggagc gcttcggcga 12600 gggcttcacc gccttccggcccagccctct gatcgcgggg ctcggcacgc cgggcggcgg 12660 ccgggcggcg gagacccccgaggaggggaa cgccaccgct gcggcggacc tcaccgccct 12720 gccgcccgcc gaactccgcaccgcgctgcg cgagctggtg cgagcccgga ccgccgcggc 12780 gctcggcctc gacgacccggccgaggtcgc cgagggcgaa cggttccccg ccatgggctt 12840 cgactccctg gccaccgtacggctgcgccg cggactcgcc tcggccacgg gcctcgacct 12900 gccccccgat ctgctcttcgaccgggacac cccggccgcg ctcgccgccc acctggccga 12960 actgctcgcc accgcacgggaccacggacc cggcggcccc gggaccggtg ccgcgccggc 13020 cgatgccgga agcggcctgccggccctcta ccgggaggcc gtccgcaccg gccgggccgc 13080 ggaaatggcc gaactgctcgccgccgcttc ccggttccgc cccgccttcg ggacggcgga 13140 ccggcagccg gtggccctcgtgccgctggc cgacggcgcg gaggacaccg ggctcccgct 13200 gctcgtgggc tgcgccgggacggcggtggc ctccggcccg gtggagttca ccgccttcgc 13260 cggagcgctg gcggacctcccggcggcggc cccgatggcc gcgctgccgc agcccggctt 13320 tctgccggga gaacgagtcccggccacccc ggaggcattg ttcgaggccc aggcggaagc 13380 gctgctgcgc tacgcggccggccggccctt cgtgctgctg gggcactccg ccggcgccaa 13440 catggcccac gccctgacccgtcatctgga ggcgaacggt ggcggccccg cagggctggt 13500 gctcatggac atctacacccccgccgaccc cggcgcgatg ggcgtctggc ggaacgacat 13560 gttccagtgg gtctggcggcgctcggacat ccccccggac gaccaccgcc tcacggccat 
13620 gggcgcctac caccggctgcttctcgactg gtcgcccacc cccgtccgcg cccccgtact 13680 gcatctgcgc gccgcggaacccatgggcga ctggccaccc ggggacaccg gctggcagtc 13740 ccactgggac ggcgcgcacaccaccgccgg catccccgga aaccacttca cgatgatgac 13800 cgaacacgcc tccgccgccgcccggctcgt gcacggctgg ctcgcggaac ggaccccgtc 13860 cgggcagggc gggtcaccgtcccgcgcggc ggggagagag gagaggccgt gaacacggca 13920 gccggcccga ccggcaccgccgccggcggc accaccgccc cggcggcggc acacgacctg 13980 tcccgcgccg gacgcaggctccaactcacc cgggccgcac agtggttcgc cggcaaccag 14040 ggagacccct acgggatgatcctgcgcgcc ggcaccgccg acccggcacc gtacgaggaa 14100 gagatccccg ggtaccgagctcgaattctt aattaaggag gtcgtagatg agtaacaaga 14160 acaacgatga gctgcagcggcaggcctcgg aaaacaccct ggggctgaac ccggtcatcg 14220 gtatccgccg caaagacctgttgagctcgg cacgcaccgt gctgcgccag gccgtgcgcc 14280 aaccgctgca cagcgccaagcatgtggccc actttggcct ggagctgaag aacgtgctgc 14340 tgggcaagtc cagccttgccccggaaagcg acgaccgtcg cttcaatgac ccggcatgga 14400 gcaacaaccc actttaccgccgctacctgc aaacctatct ggcctggcgc aaggagctgc 14460 aggactggat cggcaacagcgacctgtcgc cccaggacat cagccgcggc cagttcgtca 14520 tcaacctgat gaccgaagccatggctccga ccaacaccct gtccaacccg gcagcagtca 14580 aacgcttctt cgaaaccggcggcaagagcc tgctcgatgg cctgtccaac ctggccaagg 14640 acctggtcaa caacggtggcatgcccagcc aggtgaacat ggacgccttc gaggtgggca 14700 agaacctggg caccagtgaaggcgccgtgg tgtaccgcaa cgatgtgctg gagctgatcc 14760 agtacaagcc catcaccgagcaggtgcatg cccgcccgct gctggtggtg ccgccgcaga 14820 tcaacaagtt ctacgtattcgacctgagcc cggaaaagag cctggcacgc tactgcctgc 14880 gctcgcagca gcagaccttcatcatcagct ggcgcaaccc gaccaaagcc cagcgcgaat 14940 ggggcctgtc cacctacatcgacgcgctca aggaggcggt cgacgcggtg ctggcgatta 15000 ccggcagcaa ggacctgaacatgctcggtg cctgctccgg cggcatcacc tgcacggcat 15060 tggtcggcca ctatgccgccctcggcgaaa acaaggtcaa tgccctgacc ctgctggtca 15120 gcgtgctgga caccaccatggacaaccagg tcgccctgtt cgtcgacgag cagactttgg 15180 aggccgccaa gcgccactcctaccaggccg gtgtgctcga aggcagcgag atggccaagg 15240 tgttcgcctg gatgcgccccaacgacctga tctggaacta ctgggtcaac aactacctgc 15300 
tcggcaacga gccgccggtgttcgacatcc tgttctggaa caacgacacc acgcgcctgc 15360 cggccgcctt ccacggcgacctgatcgaaa tgttcaagag caacccgctg acccgcccgg 15420 acgccctgga ggtttgcggcactccgatcg acctgaaaca ggtcaaatgc gacatctaca 15480 gccttgccgg caccaacgaccacatcaccc cgtggcagtc atgctaccgc tcggcgcacc 15540 tgttcggcgg caagatcgagttcgtgctgt ccaacagcgg ccacatccag agcatcctca 15600 acccgccagg caaccccaaggcgcgcttca tgaccggtgc cgatcgcccg ggtgacccgg 15660 tggcctggca ggaaaacgccaccaagcatg ccgactcctg gtggctgcac tggcaaagct 15720 ggctgggcga gcgtgccggcgagctggaaa aggcgccgac ccgcctgggc aaccgtgcct 15780 atgccgctgg cgaggcatccccgggcacct acgttcacga gcgttgagct gcagcgccgt 15840 ggccacctgc gggacgccacggtgttgaat tc 15872 2 5215 PRT Streptomyces venezuelae 2 Met Asn Glu AlaIle Ala Val Val Gly Met Ser Cys Arg Leu Pro Lys 1 5 10 15 Ala Ser AsnPro Ala Ala Phe Trp Glu Leu Leu Arg Asn Gly Glu Ser 20 25 30 Ala Val ThrAsp Val Pro Ser Gly Arg Trp Thr Ser Val Leu Gly Gly 35 40 45 Ala Asp AlaGlu Glu Pro Ala Glu Ser Gly Val Arg Arg Gly Gly Phe 50 55 60 Leu Asp SerLeu Asp Leu Phe Asp Ala Ala Phe Phe Gly Ile Ser Pro 65 70 75 80 Arg GluAla Ala Ala Met Asp Pro Gln Gln Arg Leu Val Leu Glu Leu 85 90 95 Ala TrpGlu Ala Leu Glu Asp Ala Gly Ile Val Pro Gly Thr Leu Ala 100 105 110 GlySer Arg Thr Ala Val Phe Val Gly Thr Leu Arg Asp Asp Tyr Thr 115 120 125Ser Leu Leu Tyr Gln His Gly Glu Gln Ala Ile Thr Gln His Thr Met 130 135140 Ala Gly Val Asn Arg Gly Val Ile Ala Asn Arg Val Ser Tyr His Leu 145150 155 160 Gly Leu Gln Gly Pro Ser Leu Thr Val Asp Ala Ala Gln Ser SerSer 165 170 175 Leu Val Ala Val His Leu Ala Cys Glu Ser Leu Arg Ala GlyGlu Ser 180 185 190 Thr Thr Ala Leu Val Ala Gly Val Asn Leu Asn Ile LeuAla Glu Ser 195 200 205 Ala Val Thr Glu Glu Arg Phe Gly Gly Leu Ser ProAsp Gly Thr Ala 210 215 220 Tyr Thr Phe Asp Ala Arg Ala Asn Gly Phe ValArg Gly Glu Gly Gly 225 230 235 240 Gly Val Val Val Leu Lys Pro Leu SerArg Ala Leu Ala Asp Gly Asp 245 250 255 Arg Val His Gly Val Ile Arg AlaSer Ala Val Asn Asn Asp Gly Ala 260 265 270 Thr Pro Gly 
Leu Thr Val ProSer Arg Ala Ala Gln Glu Lys Val Leu 275 280 285 Arg Glu Ala Tyr Arg LysAla Ala Leu Asp Pro Ser Ala Val Gln Tyr 290 295 300 Val Glu Leu His GlyThr Gly Thr Pro Val Gly Asp Pro Ile Glu Ala 305 310 315 320 Ala Ala LeuGly Ala Val Leu Gly Ser Ala Arg Pro Ala Asp Glu Pro 325 330 335 Leu LeuVal Gly Ser Ala Lys Thr Asn Val Gly His Leu Glu Gly Ala 340 345 350 AlaGly Ile Val Gly Leu Ile Lys Thr Leu Leu Ala Leu Gly Arg Arg 355 360 365Arg Ile Pro Ala Ser Leu Asn Phe Arg Thr Pro His Pro Asp Ile Pro 370 375380 Leu Asp Thr Leu Gly Leu Asp Val Pro Asp Gly Leu Arg Glu Trp Pro 385390 395 400 His Pro Asp Arg Glu Leu Leu Ala Gly Val Ser Ser Phe Gly MetGly 405 410 415 Gly Thr Asn Ala His Val Val Leu Ser Glu Gly Pro Ala GlnGly Gly 420 425 430 Glu Gln Pro Gly Ile Asp Glu Glu Thr Pro Val Asp SerGly Ala Ala 435 440 445 Leu Pro Phe Val Val Thr Gly Arg Gly Gly Glu AlaLeu Arg Ala Gln 450 455 460 Ala Arg Arg Leu His Glu Ala Val Glu Ala AspPro Glu Leu Ala Pro 465 470 475 480 Ala Ala Leu Ala Arg Ser Leu Val ThrThr Arg Thr Val Phe Thr His 485 490 495 Arg Ser Val Val Leu Ala Pro AspArg Ala Arg Leu Leu Asp Gly Leu 500 505 510 Gly Ala Leu Ala Ala Gly ThrPro Ala Pro Gly Val Val Thr Gly Thr 515 520 525 Pro Ala Pro Gly Arg LeuAla Val Leu Phe Ser Gly Gln Gly Ala Gln 530 535 540 Arg Thr Gly Met GlyMet Glu Leu Tyr Ala Ala His Pro Ala Phe Ala 545 550 555 560 Thr Ala PheAsp Ala Val Ala Ala Glu Leu Asp Pro Leu Leu Asp Arg 565 570 575 Pro LeuAla Glu Leu Val Ala Ala Gly Asp Thr Leu Asp Arg Thr Val 580 585 590 HisThr Gln Pro Ala Leu Phe Ala Val Glu Val Ala Leu His Arg Leu 595 600 605Val Glu Ser Trp Gly Val Thr Pro Asp Leu Leu Ala Gly His Ser Val 610 615620 Gly Glu Ile Ser Ala Ala His Val Ala Gly Val Leu Ser Leu Arg Asp 625630 635 640 Ala Ala Arg Leu Val Ala Ala Arg Gly Arg Leu Met Gln Ala LeuPro 645 650 655 Glu Gly Gly Ala Met Val Ala Val Glu Ala Ser Glu Glu GluVal Leu 660 665 670 Pro His Leu Ala Gly Arg Glu Arg Glu Leu Ser Leu AlaAla Val Asn 675 680 685 Gly Pro Arg Ala Val Val Leu Ala Gly Ala Glu 
ArgAla Val Leu Asp 690 695 700 Val Ala Glu Leu Leu Arg Glu Gln Gly Arg ArgThr Lys Arg Leu Ser 705 710 715 720 Val Ser His Ala Phe His Ser Pro LeuMet Glu Pro Met Leu Asp Asp 725 730 735 Phe Arg Arg Val Val Glu Glu LeuAsp Phe Gln Glu Pro Arg Val Asp 740 745 750 Val Val Ser Thr Val Thr GlyLeu Pro Val Thr Ala Gly Gln Trp Thr 755 760 765 Asp Pro Glu Tyr Trp ValAsp Gln Val Arg Arg Pro Val Arg Phe Leu 770 775 780 Asp Ala Val Arg ThrLeu Glu Glu Ser Gly Ala Asp Thr Phe Leu Glu 785 790 795 800 Leu Gly ProAsp Gly Val Cys Ser Ala Met Ala Ala Asp Ser Val Arg 805 810 815 Asp GlnGlu Ala Ala Thr Ala Val Ser Ala Leu Arg Lys Gly Arg Pro 820 825 830 GluPro Gln Ser Leu Leu Ala Ala Leu Thr Thr Val Phe Val Arg Gly 835 840 845His Asp Val Asp Trp Thr Ala Ala His Gly Ser Thr Gly Thr Val Arg 850 855860 Val Pro Leu Pro Thr Tyr Ala Phe Gln Arg Glu Arg His Trp Phe Asp 865870 875 880 Gly Ala Ala Arg Thr Ala Ala Pro Leu Thr Ala Gly Arg Ser GlyThr 885 890 895 Gly Ala Gly Thr Gly Pro Ala Ala Gly Val Thr Ser Gly GluGly Glu 900 905 910 Gly Glu Gly Glu Gly Ala Gly Ala Gly Gly Gly Asp ArgPro Ala Arg 915 920 925 His Glu Thr Thr Glu Arg Val Arg Ala His Val AlaAla Val Leu Glu 930 935 940 Tyr Asp Asp Pro Thr Arg Val Glu Leu Gly LeuThr Phe Lys Glu Leu 945 950 955 960 Gly Phe Asp Ser Leu Met Ser Val GluLeu Arg Asn Ala Leu Val Asp 965 970 975 Asp Thr Gly Leu Arg Leu Pro SerGly Leu Leu Phe Asp His Pro Thr 980 985 990 Pro Arg Ala Leu Ala Ala HisLeu Gly Asp Leu Leu Thr Gly Gly Ser 995 1000 1005 Gly Glu Thr Gly SerAla Asp Gly Ile Pro Pro Ala Thr Pro Ala Asp 1010 1015 1020 Thr Thr AlaGlu Pro Ile Ala Ile Ile Gly Met Ala Cys Arg Tyr Pro 1025 1030 1035 1040Gly Gly Val Thr Ser Pro Glu Asp Leu Trp Arg Leu Val Ala Glu Gly 10451050 1055 Arg Asp Ala Val Ser Gly Leu Pro Thr Asp Arg Gly Trp Asp GluAsp 1060 1065 1070 Leu Phe Asp Ala Asp Pro Asp Arg Ser Gly Lys Ser SerVal Arg Glu 1075 1080 1085 Gly Gly Phe Leu His Asp Ala Ala Leu Phe AspAla Gly Phe Phe Gly 1090 1095 1100 Ile Ser Pro Arg Glu Ala Leu Gly MetAsp Pro Gln Gln Arg 
Leu Leu 1105 1110 1115 1120 Leu Glu Thr Ala Trp GluAla Val Glu Arg Ala Gly Leu Asp Pro Glu 1125 1130 1135 Gly Leu Lys GlySer Arg Thr Ala Val Phe Val Gly Ala Thr Ala Leu 1140 1145 1150 Asp TyrGly Pro Arg Met His Asp Gly Ala Glu Gly Val Glu Gly His 1155 1160 1165Leu Leu Thr Gly Thr Thr Pro Ser Val Met Ser Gly Arg Ile Ala Tyr 11701175 1180 Gln Leu Gly Leu Thr Gly Pro Ala Val Thr Val Asp Thr Ala CysSer 1185 1190 1195 1200 Ser Ser Leu Val Ala Leu His Leu Ala Val Arg SerLeu Arg Gln Gly 1205 1210 1215 Glu Ser Ser Leu Ala Leu Ala Gly Gly AlaThr Val Met Ser Thr Pro 1220 1225 1230 Gly Met Phe Val Glu Phe Ser ArgGln Arg Gly Leu Ala Ala Asp Gly 1235 1240 1245 Arg Ser Lys Ala Phe SerAsp Ser Ala Asp Gly Thr Ser Trp Ala Glu 1250 1255 1260 Gly Val Gly LeuLeu Val Val Glu Arg Leu Ser Asp Ala Glu Arg Asn 1265 1270 1275 1280 GlyHis Pro Val Leu Ala Val Ile Arg Gly Ser Ala Val Asn Gln Asp 1285 12901295 Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Gln Arg1300 1305 1310 Val Ile Arg Gln Ala Leu Ala Asp Ala Gly Leu Thr Pro AlaAsp Val 1315 1320 1325 Asp Ala Val Glu Ala His Gly Thr Gly Thr Arg LeuGly Asp Pro Ile 1330 1335 1340 Glu Ala Glu Ala Ile Leu Gly Thr Tyr GlyArg Asp Arg Gly Glu Gly 1345 1350 1355 1360 Ala Pro Leu Gln Leu Gly SerLeu Lys Ser Asn Ile Gly His Ala Gln 1365 1370 1375 Ala Ala Ala Gly ValGly Gly Leu Ile Lys Met Val Leu Ala Met Arg 1380 1385 1390 His Gly ValLeu Pro Arg Thr Leu His Val Asp Arg Pro Thr Thr Arg 1395 1400 1405 ValAsp Trp Glu Ala Gly Gly Val Glu Leu Leu Thr Glu Glu Arg Glu 1410 14151420 Trp Pro Glu Thr Gly Arg Pro Arg Arg Ala Ala Ile Ser Ser Phe Gly1425 1430 1435 1440 Ile Ser Gly Thr Asn Ala His Ile Val Val Glu Gln AlaPro Glu Ala 1445 1450 1455 Gly Glu Ala Ala Val Thr Thr Thr Ala Pro GluAla Gly Glu Ala Gly 1460 1465 1470 Glu Ala Ala Asp Thr Thr Ala Thr ThrThr Pro Ala Ala Val Gly Val 1475 1480 1485 Pro Glu Pro Val Arg Ala ProVal Val Val Ser Ala Arg Asp Ala Ala 1490 1495 1500 Ala Leu Arg Ala GlnAla Val Arg Leu Arg Thr Phe Leu Asp Gly Arg 1505 1510 1515 1520 
Pro AspVal Thr Val Ala Asp Leu Gly Arg Ser Leu Ala Ala Arg Thr 1525 1530 1535Ala Phe Glu His Lys Ala Ala Leu Thr Thr Ala Thr Arg Asp Glu Leu 15401545 1550 Leu Ala Gly Leu Asp Ala Leu Gly Arg Gly Glu Gln Ala Thr GlyLeu 1555 1560 1565 Val Thr Gly Glu Pro Ala Arg Ala Gly Arg Thr Ala PheLeu Phe Thr 1570 1575 1580 Gly Gln Gly Ala Gln Arg Val Ala Met Gly GluGlu Leu Arg Ala Ala 1585 1590 1595 1600 His Pro Val Phe Ala Ala Ala LeuAsp Thr Val Tyr Ala Ala Leu Asp 1605 1610 1615 Arg His Leu Asp Arg ProLeu Arg Glu Ile Val Ala Ala Gly Glu Glu 1620 1625 1630 Leu Asp Leu ThrAla Tyr Thr Gln Pro Ala Leu Phe Ala Phe Glu Val 1635 1640 1645 Ala LeuPhe Arg Leu Leu Glu His His Gly Leu Val Pro Asp Leu Leu 1650 1655 1660Thr Gly His Ser Val Gly Glu Ile Ala Ala Ala His Val Ala Gly Val 16651670 1675 1680 Leu Ser Leu Asp Asp Ala Ala Arg Leu Val Thr Ala Arg GlyArg Leu 1685 1690 1695 Met Gln Ser Ala Arg Glu Gly Gly Ala Met Ile AlaVal Gln Ala Gly 1700 1705 1710 Glu Ala Glu Val Val Glu Ser Leu Lys GlyTyr Glu Gly Arg Val Ala 1715 1720 1725 Val Ala Ala Val Asn Gly Pro ThrAla Val Val Val Ser Gly Asp Ala 1730 1735 1740 Asp Ala Ala Glu Glu IleArg Ala Val Trp Ala Gly Arg Gly Arg Arg 1745 1750 1755 1760 Thr Arg ArgLeu Arg Val Ser His Ala Phe His Ser Pro His Met Asp 1765 1770 1775 AspVal Leu Asp Glu Phe Leu Arg Val Ala Glu Gly Leu Thr Phe Glu 1780 17851790 Glu Pro Arg Ile Pro Val Val Ser Thr Val Thr Gly Ala Leu Val Thr1795 1800 1805 Ser Gly Glu Leu Thr Ser Pro Ala Tyr Trp Val Asp Gln IleArg Arg 1810 1815 1820 Pro Val Arg Phe Leu Asp Ala Val Arg Thr Leu AlaAla Gln Asp Ala 1825 1830 1835 1840 Thr Val Leu Val Glu Ile Gly Pro AspAla Val Leu Thr Ala Leu Ala 1845 1850 1855 Glu Glu Ala Leu Ala Pro GlyThr Asp Ala Pro Asp Ala Arg Asp Val 1860 1865 1870 Thr Val Val Pro LeuLeu Arg Ala Gly Arg Pro Glu Pro Glu Thr Leu 1875 1880 1885 Ala Ala GlyLeu Ala Thr Ala His Val His Gly Ala Pro Leu Asp Arg 1890 1895 1900 AlaSer Phe Phe Pro Asp Gly Arg Arg Thr Asp Leu Pro Thr Tyr Ala 1905 19101915 1920 Phe Arg Arg Glu His Tyr Trp 
Leu Thr Pro Glu Ala Arg Thr AspAla 1925 1930 1935 Arg Ala Leu Gly Phe Asp Pro Ala Arg His Pro Leu LeuThr Thr Thr 1940 1945 1950 Val Glu Val Ala Gly Gly Asp Gly Val Leu LeuThr Gly Arg Leu Ser 1955 1960 1965 Leu Thr Asp Gln Pro Trp Leu Ala AspHis Met Val Asn Gly Ala Val 1970 1975 1980 Leu Leu Pro Ala Thr Ala PheLeu Glu Leu Ala Leu Ala Ala Gly Asp 1985 1990 1995 2000 His Val Gly AlaVal Arg Val Glu Glu Leu Thr Leu Glu Ala Pro Leu 2005 2010 2015 Val LeuPro Glu Arg Gly Ala Val Arg Ile Gln Val Gly Val Ser Gly 2020 2025 2030Asp Gly Glu Ser Pro Ala Gly Arg Thr Phe Gly Val Tyr Ser Thr Pro 20352040 2045 Asp Ser Gly Asp Thr Gly Asp Asp Ala Pro Arg Glu Trp Thr ArgHis 2050 2055 2060 Val Ser Gly Val Leu Gly Glu Gly Asp Pro Ala Thr GluSer Asp His 2065 2070 2075 2080 Pro Gly Thr Asp Gly Asp Gly Ser Ala AlaTrp Pro Pro Ala Ala Ala 2085 2090 2095 Thr Ala Thr Pro Leu Asp Gly ValTyr Asp Arg Leu Ala Glu Leu Gly 2100 2105 2110 Tyr Gly Tyr Gly Pro AlaPhe Gln Gly Leu Thr Gly Leu Trp Arg Asp 2115 2120 2125 Gly Ala Asp ThrLeu Ala Glu Ile Arg Leu Pro Ala Ala Gln His Glu 2130 2135 2140 Ser AlaGly Leu Phe Gly Val His Pro Ala Leu Leu Asp Ala Ala Leu 2145 2150 21552160 His Pro Ile Val Leu Glu Gly Asn Ser Ala Ala Gly Ala Cys Asp Ala2165 2170 2175 Asp Thr Asp Ala Thr Asp Arg Ile Arg Leu Pro Phe Ala TrpAla Gly 2180 2185 2190 Val Thr Leu His Ala Glu Gly Ala Thr Ala Leu ArgVal Arg Ile Thr 2195 2200 2205 Pro Thr Gly Pro Asp Thr Val Thr Leu ArgLeu Thr Asp Thr Thr Gly 2210 2215 2220 Ala Pro Val Ala Thr Val Glu SerLeu Thr Leu Arg Ala Val Ala Lys 2225 2230 2235 2240 Asp Arg Leu Gly ThrThr Ala Gly Arg Val Asp Asp Ala Leu Phe Thr 2245 2250 2255 Val Val TrpThr Glu Thr Gly Thr Pro Glu Pro Ala Gly Arg Gly Ala 2260 2265 2270 ValGlu Val Glu Glu Leu Val Asp Leu Ala Gly Leu Gly Asp Leu Val 2275 22802285 Glu Leu Gly Ala Ala Asp Val Val Leu Arg Ala Asp Arg Trp Thr Leu2290 2295 2300 Asp Gly Asp Pro Ser Ala Ala Ala Arg Thr Ala Val Arg ArgThr Leu 2305 2310 2315 2320 Ala Ile Val Gln Glu Phe Leu Ser Glu Pro ArgPhe Asp Gly 
Ser Arg 2325 2330 2335 Leu Val Cys Val Thr Arg Gly Ala ValAla Ala Leu Pro Gly Glu Asp 2340 2345 2350 Val Thr Ser Leu Ala Thr GlyPro Leu Trp Gly Leu Val Arg Ser Ala 2355 2360 2365 Gln Ser Glu Asn ProGly Arg Leu Phe Leu Leu Asp Leu Gly Glu Gly 2370 2375 2380 Glu Gly GluArg Asp Gly Ala Glu Glu Leu Ile Arg Ala Ala Thr Ala 2385 2390 2395 2400Gly Asp Glu Pro Gln Leu Ala Ala Arg Asp Gly Arg Leu Leu Ala Pro 24052410 2415 Arg Leu Ala Arg Thr Ala Ala Leu Ser Ser Glu Asp Thr Ala GlyGly 2420 2425 2430 Ala Asp Arg Phe Gly Pro Asp Gly Thr Val Leu Val ThrGly Gly Thr 2435 2440 2445 Gly Gly Leu Gly Ala Leu Leu Ala Arg His LeuVal Glu Arg His Gly 2450 2455 2460 Val Arg Arg Leu Leu Leu Val Ser ArgArg Gly Ala Asp Ala Pro Gly 2465 2470 2475 2480 Ala Ala Asp Leu Gly GluAsp Leu Ala Gly Leu Gly Ala Glu Val Ala 2485 2490 2495 Phe Ala Ala AlaAsp Ala Ala Asp Arg Glu Ser Leu Ala Arg Ala Ile 2500 2505 2510 Ala ThrVal Pro Ala Glu His Pro Leu Thr Ala Val Val His Thr Ala 2515 2520 2525Gly Val Val Asp Asp Ala Thr Val Glu Ala Leu Thr Pro Glu Arg Leu 25302535 2540 Asp Ala Val Leu Arg Pro Lys Val Asp Ala Ala Trp Asn Leu HisGlu 2545 2550 2555 2560 Leu Thr Lys Asp Leu Arg Leu Asp Ala Phe Val LeuPhe Ser Ser Val 2565 2570 2575 Ser Gly Ile Val Gly Thr Ala Gly Gln AlaAsn Tyr Ala Ala Ala Asn 2580 2585 2590 Thr Gly Leu Asp Ala Leu Ala AlaHis Arg Ala Ala Thr Gly Leu Ala 2595 2600 2605 Ala Thr Ser Leu Ala TrpGly Leu Trp Asp Gly Thr His Gly Met Gly 2610 2615 2620 Gly Thr Leu GlyAla Ala Asp Leu Ala Arg Trp Ser Arg Ala Gly Ile 2625 2630 2635 2640 ThrPro Leu Thr Pro Leu Gln Gly Leu Ala Leu Phe Asp Ala Ala Val 2645 26502655 Ala Arg Asp Asp Ala Leu Leu Val Pro Ala Gly Leu Arg Pro Thr Ala2660 2665 2670 His Arg Gly Thr Asp Gly Gln Pro Pro Ala Leu Trp Arg GlyLeu Val 2675 2680 2685 Arg Ala Arg Pro Arg Arg Ala Ala Arg Thr Ala AlaGlu Ala Ala Asp 2690 2695 2700 Thr Thr Gly Gly Trp Leu Ser Gly Leu AlaAla Gln Ser Pro Glu Glu 2705 2710 2715 2720 Arg Arg Ser Thr Ala Val ThrLeu Val Thr Gly Val Val Ala Asp Val 2725 2730 2735 Leu 
Gly His Ala AspSer Ala Ala Val Gly Ala Glu Arg Ser Phe Lys 2740 2745 2750 Asp Leu GlyPhe Asp Ser Leu Ala Gly Val Glu Leu Arg Asn Arg Leu 2755 2760 2765 AsnAla Ala Thr Gly Leu Arg Leu Pro Ala Thr Thr Val Phe Asp His 2770 27752780 Pro Ser Pro Ala Ala Leu Ala Ser His Leu Leu Ala Gln Val Pro Gly2785 2790 2795 2800 Leu Lys Glu Gly Thr Ala Ala Thr Ala Thr Val Val AlaGlu Arg Gly 2805 2810 2815 Ala Ser Phe Gly Asp Arg Ala Thr Asp Asp AspPro Ile Ala Ile Val 2820 2825 2830 Gly Met Ala Cys Arg Tyr Pro Gly GlyVal Ser Ser Pro Glu Asp Leu 2835 2840 2845 Trp Arg Leu Val Ala Glu GlyThr Asp Ala Ile Ser Glu Phe Pro Val 2850 2855 2860 Asn Arg Gly Trp AspLeu Glu Ser Leu Tyr Asp Pro Asp Pro Glu Ser 2865 2870 2875 2880 Lys GlyThr Thr Tyr Cys Arg Glu Gly Gly Phe Leu Glu Gly Ala Gly 2885 2890 2895Asp Phe Asp Ala Ala Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu Val 29002905 2910 Met Asp Pro Gln Gln Arg Leu Leu Leu Glu Val Ser Trp Glu AlaLeu 2915 2920 2925 Glu Arg Ala Gly Ile Asp Pro Ser Ser Leu Arg Gly SerArg Gly Gly 2930 2935 2940 Val Tyr Val Gly Ala Ala His Gly Ser Tyr AlaSer Asp Pro Arg Leu 2945 2950 2955 2960 Val Pro Glu Gly Ser Glu Gly TyrLeu Leu Thr Gly Ser Ala Asp Ala 2965 2970 2975 Val Met Ser Gly Arg IleSer Tyr Ala Leu Gly Leu Glu Gly Pro Ser 2980 2985 2990 Met Thr Val GluThr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu 2995 3000 3005 Ala ValArg Ala Leu Arg His Gly Glu Cys Gly Leu Ala Leu Ala Gly 3010 3015 3020Gly Val Ala Val Met Ala Asp Pro Ala Ala Phe Val Glu Phe Ser Arg 30253030 3035 3040 Gln Lys Gly Leu Ala Ala Asp Gly Arg Cys Lys Ala Phe SerAla Ala 3045 3050 3055 Ala Asp Gly Thr Gly Trp Ala Glu Gly Val Gly ValLeu Val Leu Glu 3060 3065 3070 Arg Leu Ser Asp Ala Arg Arg Ala Gly HisThr Val Leu Gly Leu Val 3075 3080 3085 Thr Gly Thr Ala Val Asn Gln AspGly Ala Ser Asn Gly Leu Thr Ala 3090 3095 3100 Pro Asn Gly Pro Ala GlnGln Arg Val Ile Ala Glu Ala Leu Ala Asp 3105 3110 3115 3120 Ala Gly LeuSer Pro Glu Asp Val Asp Ala Val Glu Ala His Gly Thr 3125 3130 3135 GlyThr Arg Leu Gly Asp Pro Ile 
Glu Ala Gly Ala Leu Leu Ala Ala 3140 31453150 Ser Gly Arg Asn Arg Ser Gly Asp His Pro Leu Trp Leu Gly Ser Leu3155 3160 3165 Lys Ser Asn Ile Gly His Ala Gln Ala Ala Ala Gly Val GlyGly Val 3170 3175 3180 Ile Lys Met Leu Gln Ala Leu Arg His Gly Leu LeuPro Arg Thr Leu 3185 3190 3195 3200 His Ala Asp Glu Pro Thr Pro His AlaAsp Trp Ser Ser Gly Arg Val 3205 3210 3215 Arg Leu Leu Thr Ser Glu ValPro Trp Gln Arg Thr Gly Arg Pro Arg 3220 3225 3230 Arg Thr Gly Val SerAla Phe Gly Val Gly Gly Thr Asn Ala His Val 3235 3240 3245 Val Leu GluGlu Ala Pro Ala Pro Pro Ala Pro Glu Pro Ala Gly Glu 3250 3255 3260 AlaPro Gly Gly Ser Arg Ala Ala Glu Gly Ala Glu Gly Pro Leu Ala 3265 32703275 3280 Trp Val Val Ser Gly Arg Asp Glu Pro Ala Leu Arg Ser Gln AlaArg 3285 3290 3295 Arg Leu Arg Asp His Leu Ser Arg Thr Pro Gly Ala ArgPro Arg Asp 3300 3305 3310 Ile Ala Phe Ser Leu Ala Ala Thr Arg Ala AlaPhe Asp His Arg Ala 3315 3320 3325 Val Leu Ile Gly Ser Asp Gly Ala GluLeu Ala Ala Ala Leu Asp Ala 3330 3335 3340 Leu Ala Glu Gly Arg Asp GlyPro Ala Val Val Arg Gly Val Arg Asp 3345 3350 3355 3360 Arg Asp Gly ArgMet Ala Phe Leu Phe Thr Gly Gln Gly Ser Gln Arg 3365 3370 3375 Ala GlyMet Ala His Asp Leu His Ala Ala His Thr Phe Phe Ala Ser 3380 3385 3390Ala Leu Asp Glu Val Thr Asp Arg Leu Asp Pro Leu Leu Gly Arg Pro 33953400 3405 Leu Gly Ala Leu Leu Asp Ala Arg Pro Gly Ser Pro Glu Ala AlaLeu 3410 3415 3420 Leu Asp Arg Thr Glu Tyr Thr Gln Pro Ala Leu Phe AlaVal Glu Val 3425 3430 3435 3440 Ala Leu His Arg Leu Leu Glu His Trp GlyMet Arg Pro Asp Leu Leu 3445 3450 3455 Leu Gly His Ser Val Gly Glu LeuAla Ala Ala His Val Ala Gly Val 3460 3465 3470 Leu Asp Leu Asp Asp AlaCys Ala Leu Val Ala Ala Arg Gly Arg Leu 3475 3480 3485 Met Gln Arg LeuPro Pro Gly Gly Ala Met Val Ser Val Arg Ala Gly 3490 3495 3500 Glu AspGlu Val Arg Ala Leu Leu Ala Gly Arg Glu Asp Ala Val Cys 3505 3510 35153520 Val Ala Ala Val Asn Gly Pro Arg Ser Val Val Ile Ser Gly Ala Glu3525 3530 3535 Glu Ala Val Ala Glu Ala Ala Ala Gln Leu Ala Gly Arg GlyArg 
Arg 3540 3545 3550 Thr Arg Arg Leu Arg Val Ala His Ala Phe His SerPro Leu Met Asp 3555 3560 3565 Gly Met Leu Ala Gly Phe Arg Glu Val AlaAla Gly Leu Arg Tyr Arg 3570 3575 3580 Glu Pro Glu Leu Thr Val Val SerThr Val Thr Gly Arg Pro Ala Arg 3585 3590 3595 3600 Pro Gly Glu Leu ThrGly Pro Asp Tyr Trp Val Ala Gln Val Arg Glu 3605 3610 3615 Pro Val ArgPhe Ala Asp Ala Val Arg Thr Ala His Arg Leu Gly Ala 3620 3625 3630 ArgThr Phe Leu Glu Thr Gly Pro Asp Gly Val Leu Cys Gly Met Ala 3635 36403645 Glu Glu Cys Leu Glu Asp Asp Thr Val Ala Leu Leu Pro Ala Ile His3650 3655 3660 Lys Pro Gly Thr Ala Pro His Gly Pro Ala Ala Pro Gly AlaLeu Arg 3665 3670 3675 3680 Ala Ala Ala Ala Ala Tyr Gly Arg Gly Ala ArgVal Asp Trp Ala Gly 3685 3690 3695 Met His Ala Asp Gly Pro Glu Gly ProAla Arg Arg Val Glu Leu Pro 3700 3705 3710 Val His Ala Phe Arg His ArgArg Tyr Trp Leu Ala Pro Gly Arg Ala 3715 3720 3725 Ala Asp Thr Asp AspTrp Met Tyr Arg Ile Gly Trp Asp Arg Leu Pro 3730 3735 3740 Ala Val ThrGly Gly Ala Arg Thr Ala Gly Arg Trp Leu Val Ile His 3745 3750 3755 3760Pro Asp Ser Pro Arg Cys Arg Glu Leu Ser Gly His Ala Glu Arg Ala 37653770 3775 Leu Arg Ala Ala Gly Ala Ser Pro Val Pro Leu Pro Val Asp AlaPro 3780 3785 3790 Ala Ala Asp Arg Ala Ser Phe Ala Ala Leu Leu Arg SerAla Thr Gly 3795 3800 3805 Pro Asp Thr Arg Gly Asp Thr Ala Ala Pro ValAla Gly Val Leu Ser 3810 3815 3820 Leu Leu Ser Glu Glu Asp Arg Pro HisArg Gln His Ala Pro Val Pro 3825 3830 3835 3840 Ala Gly Val Leu Ala ThrLeu Ser Leu Met Gln Ala Met Glu Glu Glu 3845 3850 3855 Ala Val Glu AlaArg Val Trp Cys Val Ser Arg Ala Ala Val Ala Ala 3860 3865 3870 Ala AspArg Glu Arg Pro Val Gly Ala Gly Ala Ala Leu Trp Gly Leu 3875 3880 3885Gly Arg Val Ala Ala Leu Glu Arg Pro Thr Arg Trp Gly Gly Leu Val 38903895 3900 Asp Leu Pro Ala Ser Pro Gly Ala Ala His Trp Ala Ala Ala ValGlu 3905 3910 3915 3920 Arg Leu Ala Gly Pro Glu Asp Gln Ile Ala Val ArgAla Ser Gly Ser 3925 3930 3935 Trp Gly Arg Arg Leu Thr Arg Leu Pro ArgAsp Gly Gly Gly Arg Thr 3940 3945 3950 Ala Ala 
Pro Ala Tyr Arg Pro ArgGly Thr Val Leu Val Thr Gly Gly 3955 3960 3965 Thr Gly Ala Leu Gly GlyHis Leu Ala Arg Trp Leu Ala Ala Ala Gly 3970 3975 3980 Ala Glu His LeuAla Leu Thr Ser Arg Arg Gly Pro Asp Ala Pro Gly 3985 3990 3995 4000 AlaAla Gly Leu Glu Ala Glu Leu Leu Leu Leu Gly Ala Lys Val Thr 4005 40104015 Phe Ala Ala Cys Asp Thr Ala Asp Arg Asp Gly Leu Ala Arg Val Leu4020 4025 4030 Arg Ala Ile Pro Glu Asp Thr Pro Leu Thr Ala Val Phe HisAla Ala 4035 4040 4045 Gly Val Pro Gln Val Thr Pro Leu Ser Arg Thr SerPro Glu His Phe 4050 4055 4060 Ala Asp Val Tyr Ala Gly Lys Ala Ala GlyAla Ala His Leu Asp Glu 4065 4070 4075 4080 Leu Thr Arg Glu Leu Gly AlaGly Leu Asp Ala Phe Val Leu Tyr Ser 4085 4090 4095 Ser Gly Ala Gly ValTrp Gly Ser Ala Gly Gln Gly Ala Tyr Ala Ala 4100 4105 4110 Ala Asn AlaAla Leu Asp Ala Leu Ala Arg Arg Arg Ala Ala Asp Gly 4115 4120 4125 LeuPro Ala Thr Ser Ile Ala Trp Gly Val Trp Gly Gly Gly Gly Met 4130 41354140 Gly Ala Asp Glu Ala Gly Ala Glu Tyr Leu Gly Arg Arg Gly Met Arg4145 4150 4155 4160 Pro Met Ala Pro Val Ser Ala Leu Arg Ala Met Ala ThrAla Ile Ala 4165 4170 4175 Ser Gly Glu Pro Cys Pro Thr Val Thr His ThrAsp Trp Glu Arg Phe 4180 4185 4190 Gly Glu Gly Phe Thr Ala Phe Arg ProSer Pro Leu Ile Ala Gly Leu 4195 4200 4205 Gly Thr Pro Gly Gly Gly ArgAla Ala Glu Thr Pro Glu Glu Gly Asn 4210 4215 4220 Ala Thr Ala Ala AlaAsp Leu Thr Ala Leu Pro Pro Ala Glu Leu Arg 4225 4230 4235 4240 Thr AlaLeu Arg Glu Leu Val Arg Ala Arg Thr Ala Ala Ala Leu Gly 4245 4250 4255Leu Asp Asp Pro Ala Glu Val Ala Glu Gly Glu Arg Phe Pro Ala Met 42604265 4270 Gly Phe Asp Ser Leu Ala Thr Val Arg Leu Arg Arg Gly Leu AlaSer 4275 4280 4285 Ala Thr Gly Leu Asp Leu Pro Pro Asp Leu Leu Phe AspArg Asp Thr 4290 4295 4300 Pro Ala Ala Leu Ala Ala His Leu Ala Glu LeuLeu Ala Thr Ala Arg 4305 4310 4315 4320 Asp His Gly Pro Gly Gly Pro GlyThr Gly Ala Ala Pro Ala Asp Ala 4325 4330 4335 Gly Ser Gly Leu Pro AlaLeu Tyr Arg Glu Ala Val Arg Thr Gly Arg 4340 4345 4350 Ala Ala Glu MetAla Glu Leu Leu Ala 
Ala Ala Ser Arg Phe Arg Pro 4355 4360 4365 Ala PheGly Thr Ala Asp Arg Gln Pro Val Ala Leu Val Pro Leu Ala 4370 4375 4380Asp Gly Ala Glu Asp Thr Gly Leu Pro Leu Leu Val Gly Cys Ala Gly 43854390 4395 4400 Thr Ala Val Ala Ser Gly Pro Val Glu Phe Thr Ala Phe AlaGly Ala 4405 4410 4415 Leu Ala Asp Leu Pro Ala Ala Ala Pro Met Ala AlaLeu Pro Gln Pro 4420 4425 4430 Gly Phe Leu Pro Gly Glu Arg Val Pro AlaThr Pro Glu Ala Leu Phe 4435 4440 4445 Glu Ala Gln Ala Glu Ala Leu LeuArg Tyr Ala Ala Gly Arg Pro Phe 4450 4455 4460 Val Leu Leu Gly His SerAla Gly Ala Asn Met Ala His Ala Leu Thr 4465 4470 4475 4480 Arg His LeuGlu Ala Asn Gly Gly Gly Pro Ala Gly Leu Val Leu Met 4485 4490 4495 AspIle Tyr Thr Pro Ala Asp Pro Gly Ala Met Gly Val Trp Arg Asn 4500 45054510 Asp Met Phe Gln Trp Val Trp Arg Arg Ser Asp Ile Pro Pro Asp Asp4515 4520 4525 His Arg Leu Thr Ala Met Gly Ala Tyr His Arg Leu Leu LeuAsp Trp 4530 4535 4540 Ser Pro Thr Pro Val Arg Ala Pro Val Leu His LeuArg Ala Ala Glu 4545 4550 4555 4560 Pro Met Gly Asp Trp Pro Pro Gly AspThr Gly Trp Gln Ser His Trp 4565 4570 4575 Asp Gly Ala His Thr Thr AlaGly Ile Pro Gly Asn His Phe Thr Met 4580 4585 4590 Met Thr Glu His AlaSer Ala Ala Ala Arg Leu Val His Gly Trp Leu 4595 4600 4605 Ala Glu ArgThr Pro Ser Gly Gln Gly Gly Ser Pro Ser Arg Ala Ala 4610 4615 4620 GlyArg Glu Glu Arg Pro Met Ile Leu Arg Ala Gly Thr Ala Asp Pro 4625 46304635 4640 Ala Pro Tyr Glu Glu Glu Ile Pro Gly Tyr Arg Ala Arg Ile LeuAsn 4645 4650 4655 Met Ser Asn Lys Asn Asn Asp Glu Leu Gln Arg Gln AlaSer Glu Asn 4660 4665 4670 Thr Leu Gly Leu Asn Pro Val Ile Gly Ile ArgArg Lys Asp Leu Leu 4675 4680 4685 Ser Ser Ala Arg Thr Val Leu Arg GlnAla Val Arg Gln Pro Leu His 4690 4695 4700 Ser Ala Lys His Val Ala HisPhe Gly Leu Glu Leu Lys Asn Val Leu 4705 4710 4715 4720 Leu Gly Lys SerSer Leu Ala Pro Glu Ser Asp Asp Arg Arg Phe Asn 4725 4730 4735 Asp ProAla Trp Ser Asn Asn Pro Leu Tyr Arg Arg Tyr Leu Gln Thr 4740 4745 4750Tyr Leu Ala Trp Arg Lys Glu Leu Gln Asp Trp Ile Gly Asn Ser Asp 
47554760 4765 Leu Ser Pro Gln Asp Ile Ser Arg Gly Gln Phe Val Ile Asn LeuMet 4770 4775 4780 Thr Glu Ala Met Ala Pro Thr Asn Thr Leu Ser Asn ProAla Ala Val 4785 4790 4795 4800 Lys Arg Phe Phe Glu Thr Gly Gly Lys SerLeu Leu Asp Gly Leu Ser 4805 4810 4815 Asn Leu Ala Lys Asp Leu Val AsnAsn Gly Gly Met Pro Ser Gln Val 4820 4825 4830 Asn Met Asp Ala Phe GluVal Gly Lys Asn Leu Gly Thr Ser Glu Gly 4835 4840 4845 Ala Val Val TyrArg Asn Asp Val Leu Glu Leu Ile Gln Tyr Lys Pro 4850 4855 4860 Ile ThrGlu Gln Val His Ala Arg Pro Leu Leu Val Val Pro Pro Gln 4865 4870 48754880 Ile Asn Lys Phe Tyr Val Phe Asp Leu Ser Pro Glu Lys Ser Leu Ala4885 4890 4895 Arg Tyr Cys Leu Arg Ser Gln Gln Gln Thr Phe Ile Ile SerTrp Arg 4900 4905 4910 Asn Pro Thr Lys Ala Gln Arg Glu Trp Gly Leu SerThr Tyr Ile Asp 4915 4920 4925 Ala Leu Lys Glu Ala Val Asp Ala Val LeuAla Ile Thr Gly Ser Lys 4930 4935 4940 Asp Leu Asn Met Leu Gly Ala CysSer Gly Gly Ile Thr Cys Thr Ala 4945 4950 4955 4960 Leu Val Gly His TyrAla Ala Leu Gly Glu Asn Lys Val Asn Ala Leu 4965 4970 4975 Thr Leu LeuVal Ser Val Leu Asp Thr Thr Met Asp Asn Gln Val Ala 4980 4985 4990 LeuPhe Val Asp Glu Gln Thr Leu Glu Ala Ala Lys Arg His Ser Tyr 4995 50005005 Gln Ala Gly Val Leu Glu Gly Ser Glu Met Ala Lys Val Phe Ala Trp5010 5015 5020 Met Arg Pro Asn Asp Leu Ile Trp Asn Tyr Trp Val Asn AsnTyr Leu 5025 5030 5035 5040 Leu Gly Asn Glu Pro Pro Val Phe Asp Ile LeuPhe Trp Asn Asn Asp 5045 5050 5055 Thr Thr Arg Leu Pro Ala Ala Phe HisGly Asp Leu Ile Glu Met Phe 5060 5065 5070 Lys Ser Asn Pro Leu Thr ArgPro Asp Ala Leu Glu Val Cys Gly Thr 5075 5080 5085 Pro Ile Asp Leu LysGln Val Lys Cys Asp Ile Tyr Ser Leu Ala Gly 5090 5095 5100 Thr Asn AspHis Ile Thr Pro Trp Gln Ser Cys Tyr Arg Ser Ala His 5105 5110 5115 5120Leu Phe Gly Gly Lys Ile Glu Phe Val Leu Ser Asn Ser Gly His Ile 51255130 5135 Gln Ser Ile Leu Asn Pro Pro Gly Asn Pro Lys Ala Arg Phe MetThr 5140 5145 5150 Gly Ala Asp Arg Pro Gly Asp Pro Val Ala Trp Gln GluAsn Ala Thr 5155 5160 5165 Lys His Ala 
Asp Ser Trp Trp Leu His Trp Gln Ser Trp Leu Gly Glu 5170 5175 5180 Arg Ala Gly Glu Leu Glu Lys Ala Pro Thr Arg Leu Gly Asn Arg Ala 5185 5190 5195 5200 Tyr Ala Ala Gly Glu Ala Ser Pro Gly Thr Tyr Val His Glu Arg 5205 5210 5215 3 12441 DNA Streptomyces venezuelae 3 agatctgcaa cgacatctcc gaccacctgc tcgtcacccg cggcgcgccc gatgcccgcg 60 tcgtgcagcc cccgaccagc cttatcgaag gagcggcgaa gagatggcag aacccacggt 120 gaccgacgac ctgacggggg ccctcacgca gcccccgctg ggccgcaccg tccgcgcggt 180 ggccgaccgt gaactcggca cccacctcct ggagacccgc ggcatccact ggatccacgc 240 cgcgaacggc gacccgtacg ccaccgtgct gcgcggccag gcggacgacc cgtatcccgc 300 gtacgagcgg gtgcgtgccc gcggcgcgct ctccttcagc ccgacgggca gctgggtcac 360 cgccgatcac gccctggcgg cgagcatcct ctgctcgacg gacttcgggg tctccggcgc 420 cgacggcgtc ccggtgccgc agcaggtcct ctcgtacggg gagggctgtc cgctggagcg 480 cgagcaggtg ctgccggcgg ccggtgacgt gccggagggc gggcagcgtg ccgtggtcga 540 ggggatccac cgggagacgc tggagggtct cgcgccggac ccgtcggcgt cgtacgcctt 600 cgagctgctg ggcggtttcg tccgcccggc ggtgacggcc gctgccgccg ccgtgctggg 660 tgttcccgcg gaccggcgcg cggacttcgc ggatctgctg gagcggctcc ggccgctgtc 720 cgacagcctg ctggccccgc agtccctgcg gacggtacgg gcggcggacg gcgcgctggc 780 cgagctcacg gcgctgctcg ccgattcgga cgactccccc ggggccctgc tgtcggcgct 840 cggggtcacc gcagccgtcc agctcaccgg gaacgcggtg ctcgcgctcc tcgcgcatcc 900 cgagcagtgg cgggagctgt gcgaccggcc cgggctcgcg gcggccgcgg tggaggagac 960 cctccgctac gacccgccgg tgcagctcga cgcccgggtg gtccgcgggg agacggagct 1020 ggcgggccgg cggctgccgg ccggggcgca tgtcgtcgtc ctgaccgccg cgaccggccg 1080 ggacccggag gtcttcacgg acccggagcg cttcgacctc gcgcgccccg acgccgccgc 1140 gcacctcgcg ctgcaccccg ccggtccgta cggcccggtg gcgtccctgg tccggcttca 1200 ggcggaggtc gcgctgcgga ccctggccgg gcgtttcccc gggctgcggc aggcggggga 1260 cgtgctccgc ccccgccgcg cgcctgtcgg ccgcgggccg ctgagcgtcc cggtcagcag 1320 ctcctgagac accggggccc cggtccgccc ggcccccctt cggacggacc ggacggctcg 1380 gaccacgggg acggctcaga ccgtcccgtg tgtccccgtc cggctcccgt ccgccccatc 1440 ccgcccctcc accggcaagg aaggacacga cgccatgcgc gtcctgctga cctcgttcgc 1500
acatcacacg cactactacg gcctggtgcc cctggcctgggcgctgctcg ccgccgggca 1560 cgaggtgcgg gtcgccagcc agcccgcgct cacggacaccatcaccgggt ccgggctcgc 1620 cgcggtgccg gtcggcaccg accacctcat ccacgagtaccgggtgcgga tggcgggcga 1680 gccgcgcccg aaccatccgg cgatcgcctt cgacgaggcccgtcccgagc cgctggactg 1740 ggaccacgcc ctcggcatcg aggcgatcct cgccccgtacttccatctgc tcgccaacaa 1800 cgactcgatg gtcgacgacc tcgtcgactt cgcccggtcctggcagccgg acctggtgct 1860 gtgggagccg acgacctacg cgggcgccgt cgccgcccaggtcaccggtg ccgcgcacgc 1920 ccgggtcctg tgggggcccg acgtgatggg cagcgcccgccgcaagttcg tcgcgctgcg 1980 ggaccggcag ccgcccgagc accgcgagga ccccaccgcggagtggctga cgtggacgct 2040 cgaccggtac ggcgcctcct tcgaagagga gctgctcaccggccagttca cgatcgaccc 2100 gaccccgccg agcctgcgcc tcgacacggg cctgccgaccgtcgggatgc gttatgttcc 2160 gtacaacggc acgtcggtcg tgccggactg gctgagtgagccgcccgcgc ggccccgggt 2220 ctgcctgacc ctcggcgtct ccgcgcgtga ggtcctcggcggcgacggcg tctcgcaggg 2280 cgacatcctg gaggcgctcg ccgacctcga catcgagctcgtcgccacgc tcgacgcgag 2340 tcagcgcgcc gagatccgca actacccgaa gcacacccggttcacggact tcgtgccgat 2400 gcacgcgctc ctgccgagct gctcggcgat catccaccacggcggggcgg gcacctacgc 2460 gaccgccgtg atcaacgcgg tgccgcaggt catgctcgccgagctgtggg acgcgccggt 2520 caaggcgcgg gccgtcgccg agcagggggc ggggttcttcctgccgccgg ccgagctcac 2580 gccgcaggcc gtgcgggacg ccgtcgtccg catcctcgacgacccctcgg tcgccaccgc 2640 cgcgcaccgg ctgcgcgagg agaccttcgg cgaccccaccccggccggga tcgtccccga 2700 gctggagcgg ctcgccgcgc agcaccgccg cccgccggccgacgcccggc actgagccgc 2760 acccctcgcc ccaggcctca cccctgtatc tgcgccgggggacgcccccg gcccaccctc 2820 cgaaagaccg aaagcaggag caccgtgtac gaagtcgaccacgccgacgt ctacgacctc 2880 ttctacctgg gtcgcggcaa ggactacgcc gccgaggcctccgacatcgc cgacctggtg 2940 cgctcccgta cccccgaggc ctcctcgctc ctggacgtggcctgcggtac gggcacgcat 3000 ctggagcact tcaccaagga gttcggcgac accgccggcctggagctgtc cgaggacatg 3060 ctcacccacg cccgcaagcg gctgcccgac gccacgctccaccagggcga catgcgggac 3120 ttccggctcg gccggaagtt ctccgccgtg gtcagcatgttcagctccgt cggctacctg 3180 aagacgaccg aggaactcgg cgcggccgtc 
gcctcgttcgcggagcacct ggagcccggt 3240 ggcgtcgtcg tcgtcgagcc gtggtggttc ccggagaccttcgccgacgg ctgggtcagc 3300 gccgacgtcg tccgccgtga cgggcgcacc gtggcccgtgtctcgcactc ggtgcgggag 3360 gggaacgcga cgcgcatgga ggtccacttc accgtggccgacccgggcaa gggcgtgcgg 3420 cacttctccg acgtccatct catcaccctg ttccaccaggccgagtacga ggccgcgttc 3480 acggccgccg ggctgcgcgt cgagtacctg gagggcggcccgtcgggccg tggcctcttc 3540 gtcggcgtcc ccgcctgagc accgcccaag accccccggggcgggacgtc ccgggtgcac 3600 caagcaaaga gagagaaacg aaccgtgaca ggtaagacccgaataccgcg tgtccgccgc 3660 ggccgcacca cgcccagggc cttcaccctg gccgtcgtcggcaccctgct ggcgggcacc 3720 accgtggcgg ccgccgctcc cggcgccgcc gacacggccaatgttcagta cacgagccgg 3780 gcggcggagc tcgtcgccca gatgacgctc gacgagaagatcagcttcgt ccactgggcg 3840 ctggaccccg accggcagaa cgtcggctac cttcccggcgtgccgcgtct gggcatcccg 3900 gagctgcgtg ccgccgacgg cccgaacggc atccgcctggtggggcagac cgccaccgcg 3960 ctgcccgcgc cggtcgccct ggccagcacc ttcgacgacaccatggccga cagctacggc 4020 aaggtcatgg gccgcgacgg tcgcgcgctc aaccaggacatggtcctggg cccgatgatg 4080 aacaacatcc gggtgccgca cggcggccgg aactacgagaccttcagcga ggaccccctg 4140 gtctcctcgc gcaccgcggt cgcccagatc aagggcatccagggtgcggg tctgatgacc 4200 acggccaagc acttcgcggc caacaaccag gagaacaaccgcttctccgt gaacgccaat 4260 gtcgacgagc agacgctccg cgagatcgag ttcccggcgttcgaggcgtc ctccaaggcc 4320 ggcgcggcct ccttcatgtg tgcctacaac ggcctcaacgggaagccgtc ctgcggcaac 4380 gacgagctcc tcaacaacgt gctgcgcacg cagtggggcttccagggctg ggtgatgtcc 4440 gactggctcg ccaccccggg caccgacgcc atcaccaagggcctcgacca ggagatgggc 4500 gtcgagctcc ccggcgacgt cccgaagggc gagccctcgccgccggccaa gttcttcggc 4560 gaggcgctga agacggccgt cctgaacggc acggtccccgaggcggccgt gacgcggtcg 4620 gcggagcgga tcgtcggcca gatggagaag ttcggtctgctcctcgccac tccggcgccg 4680 cggcccgagc gcgacaaggc gggtgcccag gcggtgtcccgcaaggtcgc cgagaacggc 4740 gcggtgctcc tgcgcaacga gggccaggcc ctgccgctcgccggtgacgc cggcaagagc 4800 atcgcggtca tcggcccgac ggccgtcgac cccaaggtcaccggcctggg cagcgcccac 4860 gtcgtcccgg actcggcggc ggcgccactc gacaccatcaaggcccgcgc gggtgcgggt 4920 
gcgacggtga cgtacgagac gggtgaggag accttcgggacgcagatccc ggcggggaac 4980 ctcagcccgg cgttcaacca gggccaccag ctcgagccgggcaaggcggg ggcgctgtac 5040 gacggcacgc tgaccgtgcc cgccgacggc gagtaccgcatcgcggtccg tgccaccggt 5100 ggttacgcca cggtgcagct cggcagccac accatcgaggccggtcaggt ctacggcaag 5160 gtgagcagcc cgctcctcaa gctgaccaag ggcacgcacaagctcacgat ctcgggcttc 5220 gcgatgagtg ccaccccgct ctccctggag ctgggctgggtgacgccggc ggcggccgac 5280 gcgacgatcg cgaaggccgt ggagtcggcg cggaaggcccgtacggcggt cgtcttcgcc 5340 tacgacgacg gcaccgaggg cgtcgaccgt ccgaacctgtcgctgccggg tacgcaggac 5400 aagctgatct cggctgtcgc ggacgccaac ccgaacacgatcgtggtcct caacaccggt 5460 tcgtcggtgc tgatgccgtg gctgtccaag acccgcgcggtcctggacat gtggtacccg 5520 ggccaggcgg gcgccgaggc caccgccgcg ctgctctacggtgacgtcaa cccgagcggc 5580 aagctcacgc agagcttccc ggccgccgag aaccagcacgcggtcgccgg cgacccgaca 5640 agctacccgg gcgtcgacaa ccagcagacg taccgcgagggcatccacgt cgggtaccgc 5700 tggttcgaca aggagaacgt caagccgctg ttcccgttcgggcacggcct gtcgtacacc 5760 tcgttcacgc agagcgcccc gaccgtcgtg cgtacgtccacgggtggtct gaaggtcacg 5820 gtcacggtcc gcaacagcgg gaagcgcgcc ggccaggaggtcgtccaggc gtacctcggt 5880 gccagcccga acgtgacggc tccgcaggcg aagaagaagctcgtgggcta cacgaaggtc 5940 tcgctcgccg cgggcgaggc gaagacggtg acggtgaacgtcgaccgccg tcagctgcag 6000 accggttcgt cctccgccga cctgcggggc agcgccacggtcaacgtctg gtgacgtgac 6060 gccgtgaaag cggcggtgcc cgccacccgg gagggtggcgggcaccgctt tttcggcctg 6120 ctgggtctac cggaccacct gactaggcct ggtcgacccgctcggcccat tcgcgcacgg 6180 cgtcgatcac ccgcagcgcc tgcgggcgct ccaggtgcgggccgatcggc aggctgagga 6240 cctgccgcgc gaagctctcg gcccgcggga gcgagccttccggcggtgcc tcgcccgcgt 6300 aggcgggcga gaggtgcacg ggtaccgggt agtgcgtgagggtgtcgatg ccgcgggcgt 6360 cgaggtggct gcgcagctcg tcgcggcgct cggtgcgcacggtgaagagg tgccagaccg 6420 ggtcggtgtc gggcgcggtc accggcaggc cgatgccgggcagtccggcg agcccggaga 6480 ggtactccgc ggccagcgcc gacctgcggc cgttccagctgtccaggtgg gcgagccgga 6540 tccgcagcac ggcggcctgc atctcgtcca ggcgggagttggtgcccttc gtctcgtggc 6600 tgtacttctg ccgcgagccg tagttgcgga 
gcatccggagccgttcggcg agctcggggt 6660 cgccggtgac gacggcgccg ccgtcgccga agcagccgaggttcttgccc gggtagaagc 6720 tgaacgcggc caccgacgac ccggcgccga tccgccggccccggtagcgg gcgccgtggg 6780 cctgcgcggc gtcctcgacg atgtgcaggc cgtgccggtccgcgagctcg cggagggcgt 6840 ccatgtcggc ggggtgcccg tagaggtgga cggggaggagcgcccgggtg cggggggtga 6900 tcgccttctc gacgagcagc gggtccaggg tggggtggtcctcgtgcggc tcgacgggca 6960 cgggggtcgc gccggtggcg gacaccgcga gccagctggcgatgtacgtg tgcgagggga 7020 cgatcacctc gtccccgggt ccgatgccga ggccgcggagggcgagctgg agggcgtcca 7080 tcccgctgtt cacgccgacg gcgtggtccg tctcgcagtacgcggcgaac tccgcctcga 7140 atccttcgag ttcgggtccg aggaggtagc gccccgagtcgaggacgcgg gcgatcgcgg 7200 cgtcggtctc cgcgcggagc tcctcgtagg cggccttgaggtcgaggaag gggacgcggg 7260 gggtctcggc gcggctgctc acgcggacac ctccacggcggtggcgggca gctgcggggc 7320 ggtcgccttg agcggctccc accagccgcg gttctcccggtaccagcgga cggtccgcgc 7380 gaggccgtcc gcgaaggaga cctgcgggcg gtagccgagctcgcgctcga tctcgccgcc 7440 gtcgagggag tagcgcaggt cgtggccctt gcggtcggcgaccttccgga ccgaggacca 7500 gtcggcgccg agcgagtcca ggaggatgcc ggtgagttcgcggttggtca gctccaggcc 7560 gccgccgatg tggtagatct cgccggcccg gccgcccgcgaggacgagcg cgatgccccg 7620 gcagtggtcg tcggtgtgca cccactcgcg gacgttcgcgccgtcgccgt acagcgggag 7680 cgtcccgccg tcgaggaggt tcgtcacgaa gagggggatgagcttctcgg ggtgctggta 7740 cggcccgtag ttgttgcagc agcgggtgat ccgtacgtcgaggccgtacg tccggtggta 7800 ggcgcgggca acgaggtcgg agccggcctt ggacgccgcgtagggcgagt tgggctccag 7860 cgggctgctc tcggtccagg agccggagtc gatcgacccgtacacctcgt cggtggagac 7920 gtgcacgacc cggccgacgc cggcgtcgac ggcgcactggagcagcgtct gcgtgccctg 7980 cacgttggtc tcggtgaaca cggacgcgcc cgcgatggagcggtccacgt ggctctcggc 8040 cgcgaagtgg acgatggcgt ccacgccgcg cagttcccgggcgaggaggc cggcgtcgcg 8100 gatgtcgccg tggacgaagc gcagtcgcgg gtccgcgtccaccggggcga ggttggcgcg 8160 gttgcccgcg taggtgaggc tgtccaggac gatcacctcatcggcgggca cgtcggggta 8220 cgccccggcg aggagctgcc gcacgaagtg cgagccgatgaagcccgcac ctccggtcac 8280 cagaagccgc actgccgtct tcctttcggt cgcgctgtaggtcgcggtgt gggtcgcact 8340 
gtcggtggcg gtgcgggtcg cggtgtgggt cgcactgtcggtggcgctgt cggtcgtggg 8400 aacgcgtcgg ccgcgaggtg ccctcacggg gctccctcgcggccggcgat ctccatcaga 8460 tagctgccgt actcggtgcg ggagaggcct tctcccaggccgtgacaggc ctcggcgtcg 8520 atgaagccca tgcggaaggc gatctcctca aggcccgcgatccagacgcc ctgccgctcc 8580 tccaggacct ggacgtactg ggcggcccgc aggagcgagtcgtgggtgcc ggtgtccagc 8640 caggcgaagc cgcggcccag gttgacgagt tcggcccggccccgctccag gtagacgcgg 8700 ttgacgtcgg tgatctccag ctcgccgcgc ggcgagggccggatgttctt ggcgatgtcg 8760 acgacgtcgt tgtcgtagag gtagaggccg gtgacggcgaggttggagcg cggcttgacg 8820 ggcttctcga cgaggtcggt cagccggccc gtcgcgtccacctcggcgac gccgtaccgc 8880 tcggggtcct tgaccgggta gccgaagagc acgcagccgtcgaggcgcgc gatgctgtcc 8940 cgcaggagcg tgtagaggcc gggcccgtgg aagatgttgtcgcccaggat cagggcgcag 9000 gtgtcgtcgc cgatgtgctc ggctccgacg agaagtgcgtccgcgattcc tgcgggctct 9060 ttctggaccg catagtcgag ttctattccc aggtgcctgccgtttccgag aagcgactgg 9120 aagagttcga tgtgctgggg ggtcgagatg atttgaatctcgcgaatacc gccgagcatg 9180 agaaccgaca gcggatagta gatcatcggt ttgttgtagaccggaagaat ctgcttcgaa 9240 atgaccgagg tcgccggatg cagccgagtt ccgctcccgccggccaggac tattcccttc 9300 attctcggaa actagcagca gggcgccggt gataacggtcggcgtggcga gttagggggg 9360 cgctaggggc tgcgcagggg gagtgtcacc acccctttggggggtgggaa aacaccgagg 9420 gcccggccgg acggccgggc cctcaggtgg ggggatcgtgggggggggat cggggggatc 9480 ggggcgggtg cgggtcagcg caggaagccg cgggcctcctcccagccgtc cgcggcgtcg 9540 cgctccagct ggttcaggcg ggcggtgacg acctgatcgaagccgtccat gaagtactcg 9600 tcgccgtcga cggccgccac ctcgccgccg cgctcgacgaagtccctgac gacctcggtg 9660 agggaggtgt cgggggtcac gcggcccgcg atgtagcgggtcgcgccgtc caggtcgggg 9720 aagccggcct cgcggtacag gtacacgtcg ccgaggagatcgacctgcac cgcgacctgc 9780 gggtgcgcgg tgggccgcat ggtggcgggc ttgatccgcagcagttcggc gtcggccccg 9840 gtgcgcaggc tgttcagggc gtagccgtag tcgatgtggagtccgggggt gcgctcgcgg 9900 acccgctcct cgaaggcgtt gagggcctcc tggagctcggcccgctcctc ctgcggcagc 9960 ttgccgtcgt cacggccgct gtagtcctcg cgaatgttgacgaagtcgat cgtcctgccc 10020 tgcccggcgt cgttgaggtc ggcgatgaag 
tcgaccaggtcgagcaggcg ggaggcacgg 10080 cccgggagca cgatgtaggc gaagccgagg ttgatcggcgactcgcgctc ggcgcgcagc 10140 tgctggaagc ggcgcaggtt ctcgcggacg cggcggaaggcggccttctt gccggtggtc 10200 tgctcgtact cctcgtcgtt gaggccgtag agcgaggtgcggatggcgtg caggccccag 10260 aggccgggct ggcgctccag ggtgcgctcg gtgagcgcgaaggagttcgt gtagacggtg 10320 ggccgcaggc cgtggtcggt ggcgtgcgcg gccaggctcccgaggccggg gttggtgagc 10380 ggctccaggc cgccggagaa gtacatcgcc gaggggttgcccgcgggtat ctcgtcgatg 10440 accgaccgga acatggcgtt gccggcgtcg agggcggacgggtcgtagcg ggcgccggtc 10500 acacggacgc agaagtggca gcggaacatg caggtcgggccggggtagag gccgacgctg 10560 tacgggaaga cgggcttcct ggcgagcgcc gcgtcgaagacgccgcgctg ttcgagcggg 10620 agcagggtgt tcttccagta cgccccggcg gggccggtctcgaccgcggt gcggagctcc 10680 gggacctgcc cgaacagggc gaggaggcgc cggaaggcgtcccggtcgac gcccaggtcg 10740 tggcgggcct cctccagcgg ggtgaagggg ctgttgccgtagcgcacggc gagccggacg 10800 aggtggcggg cggtcgttcc ggcctcgtcg ggcggcacgaggccgccggc ggcgagggtc 10860 tggccgacgg cgtggaccgc cgcccccaga tcggctccggggtgcgcgca gcgttcggcc 10920 ggggcggtgg cggaaagggc gggggcggtc atcgggagcgtccaatcgtg ggcgtggatg 10980 tctggggggc cgcgagcggg gcgggggccg tgtcgcggtggcgcgcggtc agttcgcggc 11040 cgcgggtcgc gcagagacgc agcaggtcgg cgacccggcggatgtcgtcg tcgccgatgg 11100 cggtgccggt cggcagggac agcacgcgcg cggcgaggcgttcggtgtgc ggcagcgggg 11160 cgtgcggctg cccgcggtac ggctccagct cgtggcagcccggcgagaag taggcgcggg 11220 tgtgcacgcc ttcggccttc aggacctcca tgacgaggtcgcggtggatg ccggtggtgg 11280 cctcgtcgat ctcgacgatc acgtactggt ggttgttgaggccgtggcgg tcgtggtcgg 11340 cgacgaggac gccggggagg tccgcgaggt gctcgcggtaggcggcgtgg ttgcgccggt 11400 tccggtcgat gacctcggga aacgcgtcga gggaggtgaggcccatggcg gcggcggcct 11460 cgctcatctt ggcgttggtc ccgccggcgg ggctgccgccgggcaggtcg aagccgaagt 11520 tgtggagggc gcggatccgg gcggcgaggt cggcgtcgtcggtgacgacg gcgccgccct 11580 cgaaggcgtt gacggccttg gtggcgtgga agctgaagacctcggcgtcg ccgaggctgc 11640 cggcgggccg gccgtcgacc gcgcagccga gggcgtgcgcggcgtcgaag tacagccgca 11700 ggccgtgctc gtcggcgacc ttccgcagct 
ggtcggcggcgcaggggcgg ccccagaggt 11760 ggacgccgac gacggccgag gtgcggggtg tgaccgcggcggccacctgg tccgggtcga 11820 ggttgccggt gtccgggtcg atgtcggcga agaccggggtgaggccgatc cagcgcagtg 11880 cgtgcggggt ggcggcgaac gtcatcgacg gcatgatcacttcgccggtg aggccggcgg 11940 cgtgcgcgag gagctggagc ccggccgtgg cgttgcaggtggccacggca tgccggaccc 12000 cggcgagccc ggcgacgcgc tcctcgaact cgcggacgagcgggccgccg ttggacagcc 12060 actggctgtc gagggcccgg tcgagccgct cgtacagcctggcgcggtcg atgcggttgg 12120 gccgccccac gaggagcggc tggtcgaaag cggcggggccgccgaagaat gcgaggtcgg 12180 ataaggcgct tttcacggat gttccctccg ggccaccgtcacgaaatgat tcgccgatcc 12240 gggaatcccg aacgaggtcg ccgcgctcca ccgtgacgtacgacgagatg gtcgattgtg 12300 gtggtcgatt tcggggggac tctaatccgc gcggaacgggaccgacaaga gcacgctatg 12360 cgctctcgat gtgcttcgga tcacatccgc ctccggggtattccatcggc ggcccgaatg 12420 tgatgatcct tgacaggatc c 12441

4 3782 PRT Streptomyces venezuelae 4

Met Thr Asp Asp Leu Thr Gly Ala Leu Thr Gln Pro Pro Leu Gly Arg 1 5 10 15 Thr Val Arg Ala Val Ala Asp Arg Glu Leu Gly Thr His Leu Leu Glu 20 25 30 Thr Arg Gly Ile His Trp Ile His Ala Ala Asn Gly Asp Pro Tyr Ala 35 40 45 Thr Val Leu Arg Gly Gln Ala Asp Asp Pro Tyr Pro Ala Tyr Glu Arg 50 55 60 Val Arg Ala Arg Gly Ala Leu Ser Phe Ser Pro Thr Gly Ser Trp Val 65 70 75 80 Thr Ala Asp His Ala Leu Ala Ala Ser Ile Leu Cys Ser Thr Asp Phe 85 90 95 Gly Val Ser Gly Ala Asp Gly Val Pro Val Pro Gln Gln Val Leu Ser 100 105 110 Tyr Gly Glu Gly Cys Pro Leu Glu Arg Glu Gln Val Leu Pro Ala Ala 115 120 125 Gly Asp Val Pro Glu Gly Gly Gln Arg Ala Val Val Glu Gly Ile His 130 135 140 Arg Glu Thr Leu Glu Gly Leu Ala Pro Asp Pro Ser Ala Ser Tyr Ala 145 150 155 160 Phe Glu Leu Leu Gly Gly Phe Val Arg Pro Ala Val Thr Ala Ala Ala 165 170 175 Ala Ala Val Leu Gly Val Pro Ala Asp Arg Arg Ala Asp Phe Ala Asp 180 185 190 Leu Leu Glu Arg Leu Arg Pro Leu Ser Asp Ser Leu Leu Ala Pro Gln 195 200 205 Ser Leu Arg Thr Val Arg Ala Ala Asp Gly Ala Leu Ala Glu Leu Thr 210 215 220 Ala Leu Leu Ala Asp Ser Asp Asp Ser Pro Gly Ala Leu Leu Ser Ala 225 230 235
240 Leu Gly Val Thr Ala Ala Val Gln Leu Thr Gly Asn Ala Val Leu Ala245 250 255 Leu Leu Ala His Pro Glu Gln Trp Arg Glu Leu Cys Asp Arg ProGly 260 265 270 Leu Ala Ala Ala Ala Val Glu Glu Thr Leu Arg Tyr Asp ProPro Val 275 280 285 Gln Leu Asp Ala Arg Val Val Arg Gly Glu Thr Glu LeuAla Gly Arg 290 295 300 Arg Leu Pro Ala Gly Ala His Val Val Val Leu ThrAla Ala Thr Gly 305 310 315 320 Arg Asp Pro Glu Val Phe Thr Asp Pro GluArg Phe Asp Leu Ala Arg 325 330 335 Pro Asp Ala Ala Ala His Leu Ala LeuHis Pro Ala Gly Pro Tyr Gly 340 345 350 Pro Val Ala Ser Leu Val Arg LeuGln Ala Glu Val Ala Leu Arg Thr 355 360 365 Leu Ala Gly Arg Phe Pro GlyLeu Arg Gln Ala Gly Asp Val Leu Arg 370 375 380 Pro Arg Arg Ala Pro ValGly Arg Gly Pro Leu Ser Val Pro Val Ser 385 390 395 400 Ser Ser Met ArgVal Leu Leu Thr Ser Phe Ala His His Thr His Tyr 405 410 415 Tyr Gly LeuVal Pro Leu Ala Trp Ala Leu Leu Ala Ala Gly His Glu 420 425 430 Val ArgVal Ala Ser Gln Pro Ala Leu Thr Asp Thr Ile Thr Gly Ser 435 440 445 GlyLeu Ala Ala Val Pro Val Gly Thr Asp His Leu Ile His Glu Tyr 450 455 460Arg Val Arg Met Ala Gly Glu Pro Arg Pro Asn His Pro Ala Ile Ala 465 470475 480 Phe Asp Glu Ala Arg Pro Glu Pro Leu Asp Trp Asp His Ala Leu Gly485 490 495 Ile Glu Ala Ile Leu Ala Pro Tyr Phe His Leu Leu Ala Asn AsnAsp 500 505 510 Ser Met Val Asp Asp Leu Val Asp Phe Ala Arg Ser Trp GlnPro Asp 515 520 525 Leu Val Leu Trp Glu Pro Thr Thr Tyr Ala Gly Ala ValAla Ala Gln 530 535 540 Val Thr Gly Ala Ala His Ala Arg Val Leu Trp GlyPro Asp Val Met 545 550 555 560 Gly Ser Ala Arg Arg Lys Phe Val Ala LeuArg Asp Arg Gln Pro Pro 565 570 575 Glu His Arg Glu Asp Pro Thr Ala GluTrp Leu Thr Trp Thr Leu Asp 580 585 590 Arg Tyr Gly Ala Ser Phe Glu GluGlu Leu Leu Thr Gly Gln Phe Thr 595 600 605 Ile Asp Pro Thr Pro Pro SerLeu Arg Leu Asp Thr Gly Leu Pro Thr 610 615 620 Val Gly Met Arg Tyr ValPro Tyr Asn Gly Thr Ser Val Val Pro Asp 625 630 635 640 Trp Leu Ser GluPro Pro Ala Arg Pro Arg Val Cys Leu Thr Leu Gly 645 650 655 Val Ser AlaArg Glu Val Leu 
Gly Gly Asp Gly Val Ser Gln Gly Asp 660 665 670 Ile LeuGlu Ala Leu Ala Asp Leu Asp Ile Glu Leu Val Ala Thr Leu 675 680 685 AspAla Ser Gln Arg Ala Glu Ile Arg Asn Tyr Pro Lys His Thr Arg 690 695 700Phe Thr Asp Phe Val Pro Met His Ala Leu Leu Pro Ser Cys Ser Ala 705 710715 720 Ile Ile His His Gly Gly Ala Gly Thr Tyr Ala Thr Ala Val Ile Asn725 730 735 Ala Val Pro Gln Val Met Leu Ala Glu Leu Trp Asp Ala Pro ValLys 740 745 750 Ala Arg Ala Val Ala Glu Gln Gly Ala Gly Phe Phe Leu ProPro Ala 755 760 765 Glu Leu Thr Pro Gln Ala Val Arg Asp Ala Val Val ArgIle Leu Asp 770 775 780 Asp Pro Ser Val Ala Thr Ala Ala His Arg Leu ArgGlu Glu Thr Phe 785 790 795 800 Gly Asp Pro Thr Pro Ala Gly Ile Val ProGlu Leu Glu Arg Leu Ala 805 810 815 Ala Gln His Arg Arg Pro Pro Ala AspAla Arg His Met Tyr Glu Val 820 825 830 Asp His Ala Asp Val Tyr Asp LeuPhe Tyr Leu Gly Arg Gly Lys Asp 835 840 845 Tyr Ala Ala Glu Ala Ser AspIle Ala Asp Leu Val Arg Ser Arg Thr 850 855 860 Pro Glu Ala Ser Ser LeuLeu Asp Val Ala Cys Gly Thr Gly Thr His 865 870 875 880 Leu Glu His PheThr Lys Glu Phe Gly Asp Thr Ala Gly Leu Glu Leu 885 890 895 Ser Glu AspMet Leu Thr His Ala Arg Lys Arg Leu Pro Asp Ala Thr 900 905 910 Leu HisGln Gly Asp Met Arg Asp Phe Arg Leu Gly Arg Lys Phe Ser 915 920 925 AlaVal Val Ser Met Phe Ser Ser Val Gly Tyr Leu Lys Thr Thr Glu 930 935 940Glu Leu Gly Ala Ala Val Ala Ser Phe Ala Glu His Leu Glu Pro Gly 945 950955 960 Gly Val Val Val Val Glu Pro Trp Trp Phe Pro Glu Thr Phe Ala Asp965 970 975 Gly Trp Val Ser Ala Asp Val Val Arg Arg Asp Gly Arg Thr ValAla 980 985 990 Arg Val Ser His Ser Val Arg Glu Gly Asn Ala Thr Arg MetGlu Val 995 1000 1005 His Phe Thr Val Ala Asp Pro Gly Lys Gly Val ArgHis Phe Ser Asp 1010 1015 1020 Val His Leu Ile Thr Leu Phe His Gln AlaGlu Tyr Glu Ala Ala Phe 1025 1030 1035 1040 Thr Ala Ala Gly Leu Arg ValGlu Tyr Leu Glu Gly Gly Pro Ser Gly 1045 1050 1055 Arg Gly Leu Phe ValGly Val Pro Ala Met Thr Gly Lys Thr Arg Ile 1060 1065 1070 Pro Arg ValArg Arg Gly Arg Thr Thr Pro Arg 
Ala Phe Thr Leu Ala 1075 1080 1085 ValVal Gly Thr Leu Leu Ala Gly Thr Thr Val Ala Ala Ala Ala Pro 1090 10951100 Gly Ala Ala Asp Thr Ala Asn Val Gln Tyr Thr Ser Arg Ala Ala Glu1105 1110 1115 1120 Leu Val Ala Gln Met Thr Leu Asp Glu Lys Ile Ser PheVal His Trp 1125 1130 1135 Ala Leu Asp Pro Asp Arg Gln Asn Val Gly TyrLeu Pro Gly Val Pro 1140 1145 1150 Arg Leu Gly Ile Pro Glu Leu Arg AlaAla Asp Gly Pro Asn Gly Ile 1155 1160 1165 Arg Leu Val Gly Gln Thr AlaThr Ala Leu Pro Ala Pro Val Ala Leu 1170 1175 1180 Ala Ser Thr Phe AspAsp Thr Met Ala Asp Ser Tyr Gly Lys Val Met 1185 1190 1195 1200 Gly ArgAsp Gly Arg Ala Leu Asn Gln Asp Met Val Leu Gly Pro Met 1205 1210 1215Met Asn Asn Ile Arg Val Pro His Gly Gly Arg Asn Tyr Glu Thr Phe 12201225 1230 Ser Glu Asp Pro Leu Val Ser Ser Arg Thr Ala Val Ala Gln IleLys 1235 1240 1245 Gly Ile Gln Gly Ala Gly Leu Met Thr Thr Ala Lys HisPhe Ala Ala 1250 1255 1260 Asn Asn Gln Glu Asn Asn Arg Phe Ser Val AsnAla Asn Val Asp Glu 1265 1270 1275 1280 Gln Thr Leu Arg Glu Ile Glu PhePro Ala Phe Glu Ala Ser Ser Lys 1285 1290 1295 Ala Gly Ala Ala Ser PheMet Cys Ala Tyr Asn Gly Leu Asn Gly Lys 1300 1305 1310 Pro Ser Cys GlyAsn Asp Glu Leu Leu Asn Asn Val Leu Arg Thr Gln 1315 1320 1325 Trp GlyPhe Gln Gly Trp Val Met Ser Asp Trp Leu Ala Thr Pro Gly 1330 1335 1340Thr Asp Ala Ile Thr Lys Gly Leu Asp Gln Glu Met Gly Val Glu Leu 13451350 1355 1360 Pro Gly Asp Val Pro Lys Gly Glu Pro Ser Pro Pro Ala LysPhe Phe 1365 1370 1375 Gly Glu Ala Leu Lys Thr Ala Val Leu Asn Gly ThrVal Pro Glu Ala 1380 1385 1390 Ala Val Thr Arg Ser Ala Glu Arg Ile ValGly Gln Met Glu Lys Phe 1395 1400 1405 Gly Leu Leu Leu Ala Thr Pro AlaPro Arg Pro Glu Arg Asp Lys Ala 1410 1415 1420 Gly Ala Gln Ala Val SerArg Lys Val Ala Glu Asn Gly Ala Val Leu 1425 1430 1435 1440 Leu Arg AsnGlu Gly Gln Ala Leu Pro Leu Ala Gly Asp Ala Gly Lys 1445 1450 1455 SerIle Ala Val Ile Gly Pro Thr Ala Val Asp Pro Lys Val Thr Gly 1460 14651470 Leu Gly Ser Ala His Val Val Pro Asp Ser Ala Ala Ala Pro Leu Asp1475 
1480 1485 Thr Ile Lys Ala Arg Ala Gly Ala Gly Ala Thr Val Thr TyrGlu Thr 1490 1495 1500 Gly Glu Glu Thr Phe Gly Thr Gln Ile Pro Ala GlyAsn Leu Ser Pro 1505 1510 1515 1520 Ala Phe Asn Gln Gly His Gln Leu GluPro Gly Lys Ala Gly Ala Leu 1525 1530 1535 Tyr Asp Gly Thr Leu Thr ValPro Ala Asp Gly Glu Tyr Arg Ile Ala 1540 1545 1550 Val Arg Ala Thr GlyGly Tyr Ala Thr Val Gln Leu Gly Ser His Thr 1555 1560 1565 Ile Glu AlaGly Gln Val Tyr Gly Lys Val Ser Ser Pro Leu Leu Lys 1570 1575 1580 LeuThr Lys Gly Thr His Lys Leu Thr Ile Ser Gly Phe Ala Met Ser 1585 15901595 1600 Ala Thr Pro Leu Ser Leu Glu Leu Gly Trp Val Thr Pro Ala AlaAla 1605 1610 1615 Asp Ala Thr Ile Ala Lys Ala Val Glu Ser Ala Arg LysAla Arg Thr 1620 1625 1630 Ala Val Val Phe Ala Tyr Asp Asp Gly Thr GluGly Val Asp Arg Pro 1635 1640 1645 Asn Leu Ser Leu Pro Gly Thr Gln AspLys Leu Ile Ser Ala Val Ala 1650 1655 1660 Asp Ala Asn Pro Asn Thr IleVal Val Leu Asn Thr Gly Ser Ser Val 1665 1670 1675 1680 Leu Met Pro TrpLeu Ser Lys Thr Arg Ala Val Leu Asp Met Trp Tyr 1685 1690 1695 Pro GlyGln Ala Gly Ala Glu Ala Thr Ala Ala Leu Leu Tyr Gly Asp 1700 1705 1710Val Asn Pro Ser Gly Lys Leu Thr Gln Ser Phe Pro Ala Ala Glu Asn 17151720 1725 Gln His Ala Val Ala Gly Asp Pro Thr Ser Tyr Pro Gly Val AspAsn 1730 1735 1740 Gln Gln Thr Tyr Arg Glu Gly Ile His Val Gly Tyr ArgTrp Phe Asp 1745 1750 1755 1760 Lys Glu Asn Val Lys Pro Leu Phe Pro PheGly His Gly Leu Ser Tyr 1765 1770 1775 Thr Ser Phe Thr Gln Ser Ala ProThr Val Val Arg Thr Ser Thr Gly 1780 1785 1790 Gly Leu Lys Val Thr ValThr Val Arg Asn Ser Gly Lys Arg Ala Gly 1795 1800 1805 Gln Glu Val ValGln Ala Tyr Leu Gly Ala Ser Pro Asn Val Thr Ala 1810 1815 1820 Pro GlnAla Lys Lys Lys Leu Val Gly Tyr Thr Lys Val Ser Leu Ala 1825 1830 18351840 Ala Gly Glu Ala Lys Thr Val Thr Val Asn Val Asp Arg Arg Gln Leu1845 1850 1855 Gln Thr Gly Ser Ser Ser Ala Asp Leu Arg Gly Ser Ala ThrVal Asn 1860 1865 1870 Val Trp Met Ser Ser Arg Ala Glu Thr Pro Arg ValPro Phe Leu Asp 1875 1880 1885 Leu Lys Ala Ala 
Tyr Glu Glu Leu Arg AlaGlu Thr Asp Ala Ala Ile 1890 1895 1900 Ala Arg Val Leu Asp Ser Gly ArgTyr Leu Leu Gly Pro Glu Leu Glu 1905 1910 1915 1920 Gly Phe Glu Ala GluPhe Ala Ala Tyr Cys Glu Thr Asp His Ala Val 1925 1930 1935 Gly Val AsnSer Gly Met Asp Ala Leu Gln Leu Ala Leu Arg Gly Leu 1940 1945 1950 GlyIle Gly Pro Gly Asp Glu Val Ile Val Pro Ser His Thr Tyr Ile 1955 19601965 Ala Ser Trp Leu Ala Val Ser Ala Thr Gly Ala Thr Pro Val Pro Val1970 1975 1980 Glu Pro His Glu Asp His Pro Thr Leu Asp Pro Leu Leu ValGlu Lys 1985 1990 1995 2000 Ala Ile Thr Pro Arg Thr Arg Ala Leu Leu ProVal His Leu Tyr Gly 2005 2010 2015 His Pro Ala Asp Met Asp Ala Leu ArgGlu Leu Ala Asp Arg His Gly 2020 2025 2030 Leu His Ile Val Glu Asp AlaAla Gln Ala His Gly Ala Arg Tyr Arg 2035 2040 2045 Gly Arg Arg Ile GlyAla Gly Ser Ser Val Ala Ala Phe Ser Phe Tyr 2050 2055 2060 Pro Gly LysAsn Leu Gly Cys Phe Gly Asp Gly Gly Ala Val Val Thr 2065 2070 2075 2080Gly Asp Pro Glu Leu Ala Glu Arg Leu Arg Met Leu Arg Asn Tyr Gly 20852090 2095 Ser Arg Gln Lys Tyr Ser His Glu Thr Lys Gly Thr Asn Ser ArgLeu 2100 2105 2110 Asp Glu Met Gln Ala Ala Val Leu Arg Ile Arg Leu AlaHis Leu Asp 2115 2120 2125 Ser Trp Asn Gly Arg Arg Ser Ala Leu Ala AlaGlu Tyr Leu Ser Gly 2130 2135 2140 Leu Ala Gly Leu Pro Gly Ile Gly LeuPro Val Thr Ala Pro Asp Thr 2145 2150 2155 2160 Asp Pro Val Trp His LeuPhe Thr Val Arg Thr Glu Arg Arg Asp Glu 2165 2170 2175 Leu Arg Ser HisLeu Asp Ala Arg Gly Ile Asp Thr Leu Thr His Tyr 2180 2185 2190 Pro ValPro Val His Leu Ser Pro Ala Tyr Ala Gly Glu Ala Pro Pro 2195 2200 2205Glu Gly Ser Leu Pro Arg Ala Glu Ser Phe Ala Arg Gln Val Leu Ser 22102215 2220 Leu Pro Ile Gly Pro His Leu Glu Arg Pro Gln Ala Leu Arg ValIle 2225 2230 2235 2240 Asp Ala Val Arg Glu Trp Ala Glu Arg Val Asp GlnAla Met Arg Leu 2245 2250 2255 Leu Val Thr Gly Gly Ala Gly Phe Ile GlySer His Phe Val Arg Gln 2260 2265 2270 Leu Leu Ala Gly Ala Tyr Pro AspVal Pro Ala Asp Glu Val Ile Val 2275 2280 2285 Leu Asp Ser Leu Thr TyrAla Gly Asn Arg Ala 
Asn Leu Ala Pro Val 2290 2295 2300 Asp Ala Asp ProArg Leu Arg Phe Val His Gly Asp Ile Arg Asp Ala 2305 2310 2315 2320 GlyLeu Leu Ala Arg Glu Leu Arg Gly Val Asp Ala Ile Val His Phe 2325 23302335 Ala Ala Glu Ser His Val Asp Arg Ser Ile Ala Gly Ala Ser Val Phe2340 2345 2350 Thr Glu Thr Asn Val Gln Gly Thr Gln Thr Leu Leu Gln CysAla Val 2355 2360 2365 Asp Ala Gly Val Gly Arg Val Val His Val Ser ThrAsp Glu Val Tyr 2370 2375 2380 Gly Ser Ile Asp Ser Gly Ser Trp Thr GluSer Ser Pro Leu Glu Pro 2385 2390 2395 2400 Asn Ser Pro Tyr Ala Ala SerLys Ala Gly Ser Asp Leu Val Ala Arg 2405 2410 2415 Ala Tyr His Arg ThrTyr Gly Leu Asp Val Arg Ile Thr Arg Cys Cys 2420 2425 2430 Asn Asn TyrGly Pro Tyr Gln His Pro Glu Lys Leu Ile Pro Leu Phe 2435 2440 2445 ValThr Asn Leu Leu Asp Gly Gly Thr Leu Pro Leu Tyr Gly Asp Gly 2450 24552460 Ala Asn Val Arg Glu Trp Val His Thr Asp Asp His Cys Arg Gly Ile2465 2470 2475 2480 Ala Leu Val Leu Ala Gly Gly Arg Ala Gly Glu Ile TyrHis Ile Gly 2485 2490 2495 Gly Gly Leu Glu Leu Thr Asn Arg Glu Leu ThrGly Ile Leu Leu Asp 2500 2505 2510 Ser Leu Gly Ala Asp Trp Ser Ser ValArg Lys Val Ala Asp Arg Lys 2515 2520 2525 Gly His Asp Leu Arg Tyr SerLeu Asp Gly Gly Glu Ile Glu Arg Glu 2530 2535 2540 Leu Gly Tyr Arg ProGln Val Ser Phe Ala Asp Gly Leu Ala Arg Thr 2545 2550 2555 2560 Val ArgTrp Tyr Arg Glu Asn Arg Gly Trp Trp Glu Pro Leu Lys Ala 2565 2570 2575Thr Ala Pro Gln Leu Pro Ala Thr Ala Val Glu Val Ser Ala Met Lys 25802585 2590 Gly Ile Val Leu Ala Gly Gly Ser Gly Thr Arg Leu His Pro AlaThr 2595 2600 2605 Ser Val Ile Ser Lys Gln Ile Leu Pro Val Tyr Asn LysPro Met Ile 2610 2615 2620 Tyr Tyr Pro Leu Ser Val Leu Met Leu Gly GlyIle Arg Glu Ile Gln 2625 2630 2635 2640 Ile Ile Ser Thr Pro Gln His IleGlu Leu Phe Gln Ser Leu Leu Gly 2645 2650 2655 Asn Gly Arg His Leu GlyIle Glu Leu Asp Tyr Ala Val Gln Lys Glu 2660 2665 2670 Pro Ala Gly IleAla Asp Ala Leu Leu Val Gly Ala Glu His Ile Gly 2675 2680 2685 Asp AspThr Cys Ala Leu Ile Leu Gly Asp Asn Ile Phe His Gly Pro 2690 
2695 2700Gly Leu Tyr Thr Leu Leu Arg Asp Ser Ile Ala Arg Leu Asp Gly Cys 27052710 2715 2720 Val Leu Phe Gly Tyr Pro Val Lys Asp Pro Glu Arg Tyr GlyVal Ala 2725 2730 2735 Glu Val Asp Ala Thr Gly Arg Leu Thr Asp Leu ValGlu Lys Pro Val 2740 2745 2750 Lys Pro Arg Ser Asn Leu Ala Val Thr GlyLeu Tyr Leu Tyr Asp Asn 2755 2760 2765 Asp Val Val Asp Ile Ala Lys AsnIle Arg Pro Ser Pro Arg Gly Glu 2770 2775 2780 Leu Glu Ile Thr Asp ValAsn Arg Val Tyr Leu Glu Arg Gly Arg Ala 2785 2790 2795 2800 Glu Leu ValAsn Leu Gly Arg Gly Phe Ala Trp Leu Asp Thr Gly Thr 2805 2810 2815 HisAsp Ser Leu Leu Arg Ala Ala Gln Tyr Val Gln Val Leu Glu Glu 2820 28252830 Arg Gln Gly Val Trp Ile Ala Gly Leu Glu Glu Ile Ala Phe Arg Met2835 2840 2845 Gly Phe Ile Asp Ala Glu Ala Cys His Gly Leu Gly Glu GlyLeu Ser 2850 2855 2860 Arg Thr Glu Tyr Gly Ser Tyr Leu Met Glu Ile AlaGly Arg Glu Gly 2865 2870 2875 2880 Ala Pro Met Thr Ala Pro Ala Leu SerAla Thr Ala Pro Ala Glu Arg 2885 2890 2895 Cys Ala His Pro Gly Ala AspLeu Gly Ala Ala Val His Ala Val Gly 2900 2905 2910 Gln Thr Leu Ala AlaGly Gly Leu Val Pro Pro Asp Glu Ala Gly Thr 2915 2920 2925 Thr Ala ArgHis Leu Val Arg Leu Ala Val Arg Tyr Gly Asn Ser Pro 2930 2935 2940 PheThr Pro Leu Glu Glu Ala Arg His Asp Leu Gly Val Asp Arg Asp 2945 29502955 2960 Ala Phe Arg Arg Leu Leu Ala Leu Phe Gly Gln Val Pro Glu LeuArg 2965 2970 2975 Thr Ala Val Glu Thr Gly Pro Ala Gly Ala Tyr Trp LysAsn Thr Leu 2980 2985 2990 Leu Pro Leu Glu Gln Arg Gly Val Phe Asp AlaAla Leu Ala Arg Lys 2995 3000 3005 Pro Val Phe Pro Tyr Ser Val Gly LeuTyr Pro Gly Pro Thr Cys Met 3010 3015 3020 Phe Arg Cys His Phe Cys ValArg Val Thr Gly Ala Arg Tyr Asp Pro 3025 3030 3035 3040 Ser Ala Leu AspAla Gly Asn Ala Met Phe Arg Ser Val Ile Asp Glu 3045 3050 3055 Ile ProAla Gly Asn Pro Ser Ala Met Tyr Phe Ser Gly Gly Leu Glu 3060 3065 3070Pro Leu Thr Asn Pro Gly Leu Gly Ser Leu Ala Ala His Ala Thr Asp 30753080 3085 His Gly Leu Arg Pro Thr Val Tyr Thr Asn Ser Phe Ala Leu ThrGlu 3090 3095 3100 Arg Thr Leu Glu 
Arg Gln Pro Gly Leu Trp Gly Leu HisAla Ile Arg 3105 3110 3115 3120 Thr Ser Leu Tyr Gly Leu Asn Asp Glu GluTyr Glu Gln Thr Thr Gly 3125 3130 3135 Lys Lys Ala Ala Phe Arg Arg ValArg Glu Asn Leu Arg Arg Phe Gln 3140 3145 3150 Gln Leu Arg Ala Glu ArgGlu Ser Pro Ile Asn Leu Gly Phe Ala Tyr 3155 3160 3165 Ile Val Leu ProGly Arg Ala Ser Arg Leu Leu Asp Leu Val Asp Phe 3170 3175 3180 Ile AlaAsp Leu Asn Asp Ala Gly Gln Gly Arg Thr Ile Asp Phe Val 3185 3190 31953200 Asn Ile Arg Glu Asp Tyr Ser Gly Arg Asp Asp Gly Lys Leu Pro Gln3205 3210 3215 Glu Glu Arg Ala Glu Leu Gln Glu Ala Leu Asn Ala Phe GluGlu Arg 3220 3225 3230 Val Arg Glu Arg Thr Pro Gly Leu His Ile Asp TyrGly Tyr Ala Leu 3235 3240 3245 Asn Ser Leu Arg Thr Gly Ala Asp Ala GluLeu Leu Arg Ile Lys Pro 3250 3255 3260 Ala Thr Met Arg Pro Thr Ala HisPro Gln Val Ala Val Gln Val Asp 3265 3270 3275 3280 Leu Leu Gly Asp ValTyr Leu Tyr Arg Glu Ala Gly Phe Pro Asp Leu 3285 3290 3295 Asp Gly AlaThr Arg Tyr Ile Ala Gly Arg Val Thr Pro Asp Thr Ser 3300 3305 3310 LeuThr Glu Val Val Arg Asp Phe Val Glu Arg Gly Gly Glu Val Ala 3315 33203325 Ala Val Asp Gly Asp Glu Tyr Phe Met Asp Gly Phe Asp Gln Val Val3330 3335 3340 Thr Ala Arg Leu Asn Gln Leu Glu Arg Asp Ala Ala Asp GlyTrp Glu 3345 3350 3355 3360 Glu Ala Arg Gly Phe Leu Arg Met Lys Ser AlaLeu Ser Asp Leu Ala 3365 3370 3375 Phe Phe Gly Gly Pro Ala Ala Phe AspGln Pro Leu Leu Val Gly Arg 3380 3385 3390 Pro Asn Arg Ile Asp Arg AlaArg Leu Tyr Glu Arg Leu Asp Arg Ala 3395 3400 3405 Leu Asp Ser Gln TrpLeu Ser Asn Gly Gly Pro Leu Val Arg Glu Phe 3410 3415 3420 Glu Glu ArgVal Ala Gly Leu Ala Gly Val Arg His Ala Val Ala Thr 3425 3430 3435 3440Cys Asn Ala Thr Ala Gly Leu Gln Leu Leu Ala His Ala Ala Gly Leu 34453450 3455 Thr Gly Glu Val Ile Met Pro Ser Met Thr Phe Ala Ala Thr ProHis 3460 3465 3470 Ala Leu Arg Trp Ile Gly Leu Thr Pro Val Phe Ala AspIle Asp Pro 3475 3480 3485 Asp Thr Gly Asn Leu Asp Pro Asp Gln Val AlaAla Ala Val Thr Pro 3490 3495 3500 Arg Thr Ser Ala Val Val Gly Val HisLeu Trp 
Gly Arg Pro Cys Ala 3505 3510 3515 3520 Ala Asp Gln Leu Arg Lys Val Ala Asp Glu His Gly Leu Arg Leu Tyr 3525 3530 3535 Phe Asp Ala Ala His Ala Leu Gly Cys Ala Val Asp Gly Arg Pro Ala 3540 3545 3550 Gly Ser Leu Gly Asp Ala Glu Val Phe Ser Phe His Ala Thr Lys Ala 3555 3560 3565 Val Asn Ala Phe Glu Gly Gly Ala Val Val Thr Asp Asp Ala Asp Leu 3570 3575 3580 Ala Ala Arg Ile Arg Ala Leu His Asn Phe Gly Phe Asp Leu Pro Gly 3585 3590 3595 3600 Gly Ser Pro Ala Gly Gly Thr Asn Ala Lys Met Ser Glu Ala Ala Ala 3605 3610 3615 Ala Met Gly Leu Thr Ser Leu Asp Ala Phe Pro Glu Val Ile Asp Arg 3620 3625 3630 Asn Arg Arg Asn His Ala Ala Tyr Arg Glu His Leu Ala Asp Leu Pro 3635 3640 3645 Gly Val Leu Val Ala Asp His Asp Arg His Gly Leu Asn Asn His Gln 3650 3655 3660 Tyr Val Ile Val Glu Ile Asp Glu Ala Thr Thr Gly Ile His Arg Asp 3665 3670 3675 3680 Leu Val Met Glu Val Leu Lys Ala Glu Gly Val His Thr Arg Ala Tyr 3685 3690 3695 Phe Ser Pro Gly Cys His Glu Leu Glu Pro Tyr Arg Gly Gln Pro His 3700 3705 3710 Ala Pro Leu Pro His Thr Glu Arg Leu Ala Ala Arg Val Leu Ser Leu 3715 3720 3725 Pro Thr Gly Thr Ala Ile Gly Asp Asp Asp Ile Arg Arg Val Ala Asp 3730 3735 3740 Leu Leu Arg Leu Cys Ala Thr Arg Gly Arg Glu Leu Thr Ala Arg His 3745 3750 3755 3760 Arg Asp Thr Ala Pro Ala Pro Leu Ala Ala Pro Gln Thr Ser Thr Pro 3765 3770 3775 Thr Ile Gly Arg Ser Arg 3780

5 37948 DNA Streptomyces venezuelae 5

gggcccctcc tcacgcgtctcgatcctcgc gcgtccgcgc cttcccgcgc cggcactcgc 60 gctctcgcgt tcgttcaggccctccgcttc cgggtcccgc cggtgcggct cttcgtgctg 120 ctccggctcc gaacggtttcgcggagcaga ctcatggcat tttccccgca gggcggccga 180 cacgagctcg gtcagaacttcctcgtcgac cggtcagtga tcgacgagat cgacggcctg 240 gtggccagga ccaagggtccgatactggag atcggtccgg gtgacggcgc cctgaccctg 300 ccgctgagca ggcacggcaggccgatcacc gccgtcgagc tcgacggccg gcgcgcgcag 360 cgcctcggtg cccgcacccccggtcatgtg accgtggtgc accacgactt cctgcagtac 420 ccgctgccgc gcaacccgcatgtggtcgtc ggcaacgtcc ccttccatct gacgacggcg 480 atcatgcggc ggctgctcgacgcccagcac tggcacaccg ccgtcctcct cgtccagtgg 540 gaggtcgccc
ggcgccgggccggcgtcggc gggtcgacgc tgctgacggc cggctgggcg 600 ccctggtacg agttcgacctgcactcccgg gtccccgcgc gggccttccg tccgatgccg 660 ggcgtggacg gaggagtactggccatccgg cggcggtccg cgccgctcgt gggccaggtg 720 aagacgtacc aggacttcgtacgccaggtg ttcaccggca aggggaacgg gctgaaggag 780 atcctgcggc ggaccgggcggatctcgcag cgggacctgg cgacctggct gcggaggaac 840 gagatctcgc cgcacgcgctgcccaaggac ctgaagcccg ggcagtgggc gtcgctgtgg 900 gagctgaccg gcggcacggccgacggatcc ttcgacggta cggcgggcgg tggcgcggcc 960 ggatcgcacg gggcggctcgggtcggggcc ggtcacccgg gcggccgggt gtccgcgagc 1020 cggcggggcg tgccgcaggcgcggcgcggc cgggggcatg cggtacggag ctccacgggg 1080 accgagccga ggtggggcagggggcgggcg gagagcgcgt gagccgttct cgagcctgct 1140 gccgagccgc tgctgagccggtgctgagcc ggatccgacc gtgggtgtga atctccgggt 1200 gctcgcctcg tcctgccccgttacctgtcc gcctcccgct ccagaccagc gggaggcgga 1260 caggggcatg cccgccgggcggctaacggc ccgtgcggcg tccgtacgac gagcctcgcg 1320 cgccctggcg gcccttggtctgccggacct gtgcgcgggg tgcgcagggt tcgccgccgc 1380 gcgtggggcc gtatctgcggctcccgggca cggcggccct gctcgtctcc gagtcatagt 1440 ccctgccgcc ggcgccaccgccctggcccg gcatgcgcgt gccgggcgcc cccggcgcgt 1500 aactcggctg ggaggcctggaaaagggcga tccattgggt gagcgtgagg tccttcggca 1560 gtccgccgtc cggaattccgtggcggtcgg cgagggaacg gtaggtccgc ttggggatgt 1620 ggcgccggag gatctccgcgaggccccgtc cggggccggt gaagacggct tcggcgaagt 1680 tctggaaggc gcggctcgcgctctcgggca gcaggggctg ggggcgtcgc ctgatcgtca 1740 ggacgccgcc gtcgacgcggggcatcggac ggaacgacga ggcgcggacg cggtcgtgga 1800 ccgcgaactc gtaccagggggcccaggagg tcgtgaggag cgatccgccg ctgcgaccgg 1860 cgcgtttgcg ggcgacctcccactgcacta tcagggccgc cgactgccag ttcgtcgatt 1920 ccaggagact ccggagaatctgggtcgtga tgccgaaggg aacgtttccg acgacggtgt 1980 cgatatcgcg cggaatgcggaagtcgagga aatcaccctg gaatacggtg accctctccc 2040 cttcgaattt ccgccgcacatgcgcggccc agtgcgggtc catctccacg accgtcacgg 2100 tgtcgaagga gcgcaccaactcctcggtta tcgcgccctt tccggggccg atttcgagaa 2160 cgttcctacc gtccccctcgacatgcgtga cgagattgcg cacggctctg tcgtcctgaa 2220 ggaagttctg gcctaattcgcggcgaaggg tgtcgcggtc cgctcgcctc 
ggtatggagt 2280 cgcgcattgc catgaacgatcccctccctg gatgccgtgg tcaatggact tggcacggac 2340 catacctcac ggtccgtcggacgaccggag aagaagttca cgcacgggcg ttccggagta 2400 cgggagttgt gaacggccgcgacgaagtcg gtcgcggctc ggcgggcggt gacgagcgag 2460 gtccggagga acgcgacgaagcagccgaac cccaagtgag gtgcgacgga gtgacattgg 2520 gggcatacgg agggttgtcgtacggagcgc actcaacgag gctccaggag ggaggggttg 2580 aacccgccgc cgactggccttcgccgcccg cgcggccgga gtatgtcatg tcgggggtga 2640 aatcaagcca ttcccccgggatcggctgtt acccatccct ttacctggcg tggatttccc 2700 aacccttggt atagagcgggagacgacgcg acaccatgga gaccacgcac accacgagcg 2760 ccaccccccg gccatcccgacaaggggggt ccggctcgcc tcccgacacc catggcctgg 2820 ggtacacgcc aggtatagggggaacgtagg gggagcatag ggggggtgcc ctggggttgg 2880 gtgaaagcgc ggcttccggagacggagccg gatgtcttca gccggaatta ccaggaccgg 2940 tgcgagaaca ccggtgacagggcgtggggc ggcagcgtgg gacacggggg aagtgcgggt 3000 ccgacggggg ttgccccctgccggccccga tcatgcggag cactccttct ctcgtgctcc 3060 taccggtgat gtgcgcgccgaattgattcg tggagagatg tcgacagtgt ccaagagtga 3120 gtccgaggaa ttcgtgtccgtgtcgaacga cgccggttcc gcgcacggca cagcggaacc 3180 cgtcgccgtc gtcggcatctcctgccgggt gcccggcgcc cgggacccga gagagttctg 3240 ggaactcctg gcggcaggcggccaggccgt caccgacgtc cccgcggacc gctggaacgc 3300 cggcgacttc tacgacccggaccgctccgc ccccggccgc tcgaacagcc ggtggggcgg 3360 gttcatcgag gacgtcgaccggttcgacgc cgccttcttc ggcatctcgc cccgcgaggc 3420 cgcggagatg gacccgcagcagcggctcgc cctggagctg ggctgggagg ccctggagcg 3480 cgccgggatc gacccgtcctcgctcaccgg cacccgcacc ggcgtcttcg ccggcgccat 3540 ctgggacgac tacgccaccctgaagcaccg ccagggcggc gccgcgatca ccccgcacac 3600 cgtcaccggc ctccaccgcggcatcatcgc gaaccgactc tcgtacacgc tcgggctccg 3660 cggccccagc atggtcgtcgactccggcca gtcctcgtcg ctcgtcgccg tccacctcgc 3720 gtgcgagagc ctgcggcgcggcgagtccga gctcgccctc gccggcggcg tctcgctcaa 3780 cctggtgccg gacagcatcatcggggcgag caagttcggc ggcctctccc ccgacggccg 3840 cgcctacacc ttcgacgcgcgcgccaacgg ctacgtacgc ggcgagggcg gcggtttcgt 3900 cgtcctgaag cgcctctcccgggccgtcgc cgacggcgac ccggtgctcg ccgtgatccg 3960 gggcagcgcc 
gtcaacaacggcggcgccgc ccagggcatg acgacccccg acgcgcaggc 4020 gcaggaggcc gtgctccgcgaggcccacga gcgggccggg accgcgccgg ccgacgtgcg 4080 gtacgtcgag ctgcacggcaccggcacccc cgtgggcgac ccgatcgagg ccgctgcgct 4140 cggcgccgcc ctcggcaccggccgcccggc cggacagccg ctcctggtcg gctcggtcaa 4200 gacgaacatc ggccacctggagggcgcggc cggcatcgcc ggcctcatca aggccgtcct 4260 ggcggtccgc ggtcgcgcgctgcccgccag cctgaactac gagaccccga acccggcgat 4320 cccgttcgag gaactgaacctccgggtgaa cacggagtac ctgccgtggg agccggagca 4380 cgacgggcag cggatggtcgtcggcgtgtc ctcgttcggc atgggcggca cgaacgcgca 4440 tgtcgtgctc gaagaggcccccgggggttg tcgaggtgct tcggtcgtgg agtcgacggt 4500 cggcgggtcg gcggtcggcggcggtgtggt gccgtgggtg gtgtcggcga agtccgctgc 4560 cgcgctggac gcgcagatcgagcggcttgc cgcgttcgcc tcgcgggatc gtacggatgg 4620 tgtcgacgcg ggcgctgtcgatgcgggtgc tgtcgatgcg ggtgctgtcg ctcgcgtact 4680 ggccggcggg cgtgctcagttcgagcaccg ggccgtcgtc gtcggcagcg ggccggacga 4740 tctggcggca gcgctggccgcgcctgaggg tctggtccgg ggcgtggctt ccggtgtcgg 4800 gcgagtggcg ttcgtgttccccgggcaggg cacgcagtgg gccggcatgg gtgccgaact 4860 gctggactct tccgcggtgttcgcggcggc catggccgaa tgcgaggccg cactctcccc 4920 gtacgtcgac tggtcgctggaggccgtcgt acggcaggcc cccggtgcgc ccacgctgga 4980 gcgggtcgat gtcgtgcagcctgtgacgtt cgccgtcatg gtctcgctgg ctcgcgtgtg 5040 gcagcaccac ggggtgacgccccaggcggt cgtcggccac tcgcagggcg agatcgccgc 5100 cgcgtacgtc gccggtgccctgagcctgga cgacgccgct cgtgtcgtga ccctgcgcag 5160 caagtccatc gccgcccacctcgccggcaa gggcggcatg ctgtccctcg cgctgagcga 5220 ggacgccgtc ctggagcgactggccgggtt cgacgggctg tccgtcgccg ctgtgaacgg 5280 gcccaccgcc accgtggtctccggtgaccc cgtacagatc gaagagcttg ctcgggcgtg 5340 tgaggccgat ggggtccgtgcgcgggtcat tcccgtcgac tacgcgtccc acagccggca 5400 ggtcgagatc atcgagagcgagctcgccga ggtcctcgcc gggctcagcc cgcaggctcc 5460 gcgcgtgccg ttcttctcgacactcgaagg cgcctggatc accgagcccg tgctcgacgg 5520 cggctactgg taccgcaacctgcgccatcg tgtgggcttc gccccggccg tcgagaccct 5580 ggccaccgac gagggcttcacccacttcgt cgaggtcagc gcccaccccg tcctcaccat 5640 ggccctcccc gggaccgtcaccggtctggc gaccctgcgt 
cgcgacaacg gcggtcagga 5700 ccgcctagtc gcctccctcgccgaagcatg ggccaacgga ctcgcggtcg actggagccc 5760 gctcctcccc tccgcgaccggccaccactc cgacctcccc acctacgcgt tccagaccga 5820 gcgccactgg ctgggcgagatcgaggcgct cgccccggcg ggcgagccgg cggtgcagcc 5880 cgccgtcctc cgcacggaggcggccgagcc ggcggagctc gaccgggacg agcagctgcg 5940 cgtgatcctg gacaaggtccgggcgcagac ggcccaggtg ctggggtacg cgacaggcgg 6000 gcagatcgag gtcgaccggaccttccgtga ggccggttgc acctccctga ccggcgtgga 6060 cctgcgcaac cggatcaacgccgccttcgg cgtacggatg gcgccgtcca tgatcttcga 6120 cttccccacc cccgaggctctcgcggagca gctgctcctc gtcgtgcacg gggaggcggc 6180 ggcgaacccg gccggtgcggagccggctcc ggtggcggcg gccggtgccg tcgacgagcc 6240 ggtggcgatc gtcggcatggcctgccgcct gcccggtggg gtcgcctcgc cggaggacct 6300 gtggcggctg gtggccggcggcggggacgc gatctcggag ttcccgcagg accgcggctg 6360 ggacgtggag gggctgtaccacccggatcc ggagcacccc ggcacgtcgt acgtccgcca 6420 gggcggtttc atcgagaacgtcgccggctt cgacgcggcc ttcttcggga tctcgccgcg 6480 cgaggccctc gccatggacccgcagcagcg gctcctcctc gaaacctcct gggaggccgt 6540 cgaggacgcc gggatcgacccgacctccct gcggggacgg caggtcggcg tcttcactgg 6600 ggcgatgacc cacgagtacgggccgagcct gcgggacggc ggggaaggcc tcgacggcta 6660 cctgctgacc ggcaacacggccagcgtgat gtcgggccgc gtctcgtaca cactcggcct 6720 tgagggcccc gccctgacggtggacacggc ctgctcgtcg tcgctggtcg ccctgcacct 6780 cgccgtgcag gccctgcgcaagggcgaggt cgacatggcg ctcgccggcg gcgtggccgt 6840 gatgcccacg cccgggatgttcgtcgagtt cagccggcag cgcgggctgg ccggggacgg 6900 ccggtcgaag gcgttcgccgcgtcggcgga cggcaccagc tggtccgagg gcgtcggcgt 6960 cctcctcgtc gagcgcctgtcggacgcccg ccgcaacgga caccaggtcc tcgcggtcgt 7020 ccgcggcagc gccttgaaccaggacggcgc gagcaacggc ctcacggctc cgaacgggcc 7080 ctcgcagcag cgcgtcatccggcgcgcgct ggcggacgcc cggctgacga cctccgacgt 7140 ggacgtcgtc gaggcacacggcacgggcac gcgactcggc gacccgatcg aggcgcaggc 7200 cctgatcgcc acctacggccagggccgtga cgacgaacag ccgctgcgcc tcgggtcgtt 7260 gaagtccaac atcgggcacacccaggccgc ggccggcgtc tccggtgtca tcaagatggt 7320 ccaggcgatg cgccacggactgctgccgaa gacgctgcac gtcgacgagc cctcggacca 7380 gatcgactgg 
tcggctggcgccgtggaact cctcaccgag gccgtcgact ggccggagaa 7440 gcaggacggc gggctgcgccgggccgccgt ctcctccttc gggatcagcg gcaccaatgc 7500 gcatgtggtg ctcgaagaggccccggtggt tgtcgagggt gcttcggtcg tcgagccgtc 7560 ggttggcggg tcggcggtcggcggcggtgt gacgccttgg gtggtgtcgg cgaagtccgc 7620 tgccgcgctc gacgcgcagatcgagcggct tgccgcattc gcctcgcggg atcgtacgga 7680 tgacgccgac gccggtgctgtcgacgcggg cgctgtcgct cacgtactgg ctgacgggcg 7740 tgctcagttc gagcaccgggccgtcgcgct cggcgccggg gcggacgacc tcgtacaggc 7800 gctggccgat ccggacgggctgatacgcgg aacggcttcc ggtgtcgggc gagtggcgtt 7860 cgtgttcccc ggtcagggcacgcagtgggc tggcatgggt gccgaactgc tggactcttc 7920 cgcggtgttc gcggcggccatggccgagtg tgaggccgcg ctgtccccgt acgtcgactg 7980 gtcgctggag gccgtcgtacggcaggcccc cggtgcgccc acgctggagc gggtcgatgt 8040 cgtgcagcct gtgacgttcgccgtcatggt ctcgctggct cgcgtgtggc agcaccacgg 8100 tgtgacgccc caggcggtcgtcggccactc gcagggcgag atcgccgccg cgtacgtcgc 8160 cggagccctg cccctggacgacgccgcccg cgtcgtcacc ctgcgcagca agtccatcgc 8220 cgcccacctc gccggcaagggcggcatgct gtccctcgcg ctgaacgagg acgccgtcct 8280 ggagcgactg agtgacttcgacgggctgtc cgtcgccgcc gtcaacgggc ccaccgccac 8340 tgtcgtgtcg ggtgaccccgtacagatcga agagcttgct caggcgtgca aggcggacgg 8400 attccgcgcg cggatcattcccgtcgacta cgcgtcccac agccggcagg tcgagatcat 8460 cgagagcgag ctcgcccaggtcctcgccgg tctcagcccg caggccccgc gcgtgccgtt 8520 cttctcgacg ctcgaaggcacctggatcac cgagcccgtc ctcgacggca cctactggta 8580 ccgcaacctc cgtcaccgcgtcggcttcgc ccccgccatc gagaccctgg ccgtcgacga 8640 gggcttcacg cacttcgtcgaggtcagcgc ccaccccgtc ctcaccatga ccctccccga 8700 gaccgtcacc ggcctcggcaccctccgtcg cgaacaggga ggccaagagc gtctggtcac 8760 ctcgctcgcc gaggcgtgggtcaacgggct tcccgtggca tggacttcgc tcctgcccgc 8820 cacggcctcc cgccccggtctgcccaccta cgccttccag gccgagcgct actggctcga 8880 gaacactccc gccgccctggccaccggcga cgactggcgc taccgcatcg actggaagcg 8940 cctcccggcc gccgaggggtccgagcgcac cggcctgtcc ggccgctggc tcgccgtcac 9000 gccggaggac cactccgcgcaggccgccgc cgtgctcacc gcgctggtcg acgccggggc 9060 gaaggtcgag gtgctgacggccggggcgga cgacgaccgt 
gaggccctcg ccgcccggct 9120 caccgcactg acgaccggtgacggcttcac cggcgtggtc tcgctcctcg acggactcgt 9180 accgcaggtc gcctgggtccaggcgctcgg cgacgccgga atcaaggcgc ccctgtggtc 9240 cgtcacccag ggcgcggtctccgtcggacg tctcgacacc cccgccgacc ccgaccgggc 9300 catgctctgg ggcctcggccgcgtcgtcgc ccttgagcac cccgaacgct gggccggcct 9360 cgtcgacctc cccgcccagcccgatgccgc cgccctcgcc cacctcgtca ccgcactctc 9420 cggcgccacc ggcgaggaccagatcgccat ccgcaccacc ggactccacg cccgccgcct 9480 cgcccgcgca cccctccacggacgtcggcc cacccgcgac tggcagcccc acggcaccgt 9540 cctcatcacc ggcggcaccggagccctcgg cagccacgcc gcacgctgga tggcccacca 9600 cggagccgaa cacctcctcctcgtcagccg cagcggcgaa caagcccccg gagccaccca 9660 actcaccgcc gaactcaccgcatcgggcgc ccgcgtcacc atcgccgcct gcgacgtcgc 9720 cgacccccac gccatgcgcaccctcctcga cgccatcccc gccgagacgc ccctcaccgc 9780 cgtcgtccac accgccggcgcgctcgacga cggcatcgtg gacacgctga ccgccgagca 9840 ggtccggcgg gcccaccgtgcgaaggccgt cggcgcctcg gtgctcgacg agctgacccg 9900 ggacctcgac ctcgacgcgttcgtgctctt ctcgtccgtg tcgagcactc tgggcatccc 9960 cggtcagggc aactacgccccgcacaacgc ctacctcgac gccctcgcgg ctcgccgccg 10020 ggccaccggc cggtccgccgtctcggtggc ctggggaccg tgggacggtg gcggcatggc 10080 cgccggtgac ggcgtggccgagcggctgcg caaccacggc gtgcccggca tggacccgga 10140 actcgccctg gccgcactggagtccgcgct cggccgggac gagaccgcga tcaccgtcgc 10200 ggacatcgac tgggaccgcttctacctcgc gtactcctcc ggtcgcccgc agcccctcgt 10260 cgaggagctg cccgaggtgcggcgcatcat cgacgcacgg gacagcgcca cgtccggaca 10320 gggcgggagc tccgcccagggcgccaaccc cctggccgag cggctggccg ccgcggctcc 10380 cggcgagcgt acggagatcctcctcggtct cgtacgggcg caggccgccg ccgtgctccg 10440 gatgcgttcg ccggaggacgtcgccgccga ccgcgccttc aaggacatcg gcttcgactc 10500 gctcgccggt gtcgagctgcgcaacaggct gacccgggcg accgggctcc agctgcccgc 10560 gacgctcgtc ttcgaccacccgacgccgct ggccctcgtg tcgctgctcc gcagcgagtt 10620 cctcggtgac gaggagacggcggacgcccg gcggtccgcg gcgctgcccg cgactgtcgg 10680 tgccggtgcc ggcgccggcgccggcaccga tgccgacgac gatccgatcg cgatcgtcgc 10740 gatgagctgc cgctaccccggtgacatccg cagcccggag gacctgtggc ggatgctgtc 
10800 cgagggcggc gagggcatcacgccgttccc caccgaccgc ggctgggacc tcgacggcct 10860 gtacgacgcc gacccggacgcgctcggcag ggcgtacgtc cgcgagggcg ggttcctgca 10920 cgacgcggcc gagttcgacgcggagttctt cggcgtctcg ccgcgcgagg cgctggccat 10980 ggacccgcag cagcggatgctcctgacgac gtcctgggag gccttcgagc gggccggcat 11040 cgagccggca tcgctgcgcggcagcagcac cggtgtcttc atcggcctct cctaccagga 11100 ctacgcggcc cgcgtcccgaacgccccgcg tggcgtggag ggttacctgc tgaccggcag 11160 cacgccgagc gtcgcgtcgggccgtatcgc gtacaccttc ggtctcgaag ggcccgcgac 11220 gaccgtcgac accgcctgctcgtcgtcgct gaccgccctg cacctggcgg tgcgggcgct 11280 gcgcagcggc gagtgcacgatggcgctcgc cggtggcgtg gcgatgatgg cgaccccgca 11340 catgttcgtg gagttcagccgtcagcgggc gctcgccccg gacggccgca gcaaggcctt 11400 ctcggcggac gccgacgggttcggcgccgc ggagggcgtc ggcctgctgc tcgtggagcg 11460 gctctcggac gcgcggcgcaacggtcaccc ggtgctcgcc gtggtccgcg gtaccgccgt 11520 caaccaggac ggcgccagcaacgggctgac cgcgcccaac ggaccctcgc agcagcgggt 11580 gatccggcag gcgctcgccgacgcccggct ggcacccggc gacatcgacg ccgtcgagac 11640 gcacggcacg ggaacctcgctgggcgaccc catcgaggcc cagggcctcc aggccacgta 11700 cggcaaggag cggcccgcggaacggccgct cgccatcggc tccgtgaagt ccaacatcgg 11760 acacacccag gccgcggccggtgcggcggg catcatcaag atggtcctcg cgatgcgcca 11820 cggcaccctg ccgaagaccctccacgccga cgagccgagc ccgcacgtcg actgggcgaa 11880 cagcggcctg gccctcgtcaccgagccgat cgactggccg gccggcaccg gtccgcgccg 11940 cgccgccgtc tcctccttcggcatcagcgg gacgaacgcg cacgtcgtgc tggagcaggc 12000 gccggatgct gctggtgaggtgcttggggc cgatgaggtg cctgaggtgt ctgagacggt 12060 agcgatggct gggacggctgggacctccga ggtcgctgag ggctctgagg cctccgaggc 12120 ccccgcggcc cccggcagccgtgaggcgtc cctccccggg cacctgccct gggtgctgtc 12180 cgccaaggac gagcagtcgctgcgcggcca ggccgccgcc ctgcacgcgt ggctgtccga 12240 gcccgccgcc gacctgtcggacgcggacgg accggcccgc ctgcgggacg tcgggtacac 12300 gctcgccacg agccgtaccgccttcgcgca ccgcgccgcc gtgaccgccg ccgaccggga 12360 cgggttcctg gacgggctggccacgctggc ccagggcggc acctcggccc acgtccacct 12420 ggacaccgcc cgggacggcaccaccgcgtt cctcttcacc ggccagggca gtcagcgccc 12480 
cggcgccggc cgtgagctgtacgaccggca ccccgtcttc gcccgggcgc tcgacgagat 12540 ctgcgcccac ctcgacggtcacctcgaact gcccctgctc gacgtgatgt tcgcggccga 12600 gggcagcgcg gaggccgcgctgctcgacga gacgcggtac acgcagtgcg cgctgttcgc 12660 cctggaggtc gcgctcttccggctcgtcga gagctggggc atgcggccgg ccgcactgct 12720 cggtcactcg gtcggcgagatcgccgccgc gcacgtcgcc ggtgtgttct cgctcgccga 12780 cgccgcccgc ctggtcgccgcgcgcggccg gctcatgcag gagctgcccg ccggtggcgc 12840 gatgctcgcc gtccaggccgcggaggacga gatccgcgtg tggctggaga cggaggagcg 12900 gtacgcggga cgtctggacgtcgccgccgt caacggcccc gaggccgccg tcctgtccgg 12960 cgacgcggac gcggcgcgggaggcggaggc gtactggtcc gggctcggcc gcaggacccg 13020 cgcgctgcgg gtcagccacgccttccactc cgcgcacatg gacggcatgc tcgacgggtt 13080 ccgcgccgtc ctggagacggtggagttccg gcgcccctcc ctgaccgtgg tctcgaacgt 13140 caccggcctg gccgccggcccggacgacct gtgcgacccc gagtactggg tccggcacgt 13200 ccgcggcacc gtccgcttcctcgacggcgt ccgtgtcctg cgcgacctcg gcgtgcggac 13260 ctgcctggag ctgggccccgacggggtcct caccgccatg gcggccgacg gcctcgcgga 13320 cacccccgcg gattccgctgccggctcccc cgtcggctct cccgccggct ctcccgccga 13380 ctccgccgcc ggcgcgctccggccccggcc gctgctcgtg gcgctgctgc gccgcaagcg 13440 gtcggagacc gagaccgtcgcggacgccct cggcagggcg cacgcccacg gcaccggacc 13500 cgactggcac gcctggttcgccggctccgg ggcgcaccgc gtggacctgc ccacgtactc 13560 cttccggcgc gaccgctactggctggacgc cccggcggcc gacaccgcgg tggacaccgc 13620 cggcctcggt ctcggcaccgccgaccaccc gctgctcggc gccgtggtca gccttccgga 13680 ccgggacggc ctgctgctcaccggccgcct ctccctgcgc acccacccgt ggctcgcgga 13740 ccacgccgtc ctggggagcgtcctgctccc cggcgccgcg atggtcgaac tcgccgcgca 13800 cgctgcggag tccgccggtctgcgtgacgt gcgggagctg accctccttg aaccgctggt 13860 actgcccgag cacggtggcgtcgagctgcg cgtgacggtc ggggcgccgg ccggagagcc 13920 cggtggcgag tcggccggggacggcgcacg gcccgtctcc ctccactcgc ggctcgccga 13980 cgcgcccgcc ggtaccgcctggtcctgcca cgcgaccggt ctgctggcca ccgaccggcc 14040 cgagcttccc gtcgcgcccgaccgtgcggc catgtggccg ccgcagggcg ccgaggaggt 14100 gccgctcgac ggtctctacgagcggctcga cgggaacggc ctcgccttcg gtccgctgtt 14160 ccaggggctg 
aacgcggtgtggcggtacga gggtgaggtc ttcgccgaca tcgcgctccc 14220 cgccaccacg aatgcgaccgcgcccgcgac cgcgaacggc ggcgggagtg cggcggcggc 14280 cccctacggc atccaccccgccctgctcga cgcttcgctg cacgccatcg cggtcggcgg 14340 tctcgtcgac gagcccgagctcgtccgcgt ccccttccac tggagcggtg tcaccgtgca 14400 cgcggccggt gccgcggcggcccgggtccg tctcgcctcc gcggggacgg acgccgtctc 14460 gctgtccctg acggacggcgagggacgccc gctggtctcc gtggaacggc tcacgctgcg 14520 cccggtcacc gccgatcaggcggcggcgag ccgcgtcggc gggctgatgc accgggtggc 14580 ctggcgtccg tacgccctcgcctcgtccgg cgaacaggac ccgcacgcca cttcgtacgg 14640 gccgaccgcc gtcctcggcaaggacgagct gaaggtcgcc gccgccctgg agtccgcggg 14700 cgtcgaagtc gggctctaccccgacctggc cgcgctgtcc caggacgtgg cggccggcgc 14760 cccggcgccc cgtaccgtccttgcgccgct gcccgcgggt cccgccgacg gcggcgcgga 14820 gggtgtacgg ggcacggtggcccggacgct ggagctgctc caggcctggc tggccgacga 14880 gcacctcgcg ggcacccgcctgctcctggt cacccgcggt gcggtgcggg accccgaggg 14940 gtccggcgcc gacgatggcggcgaggacct gtcgcacgcg gccgcctggg gtctcgtacg 15000 gaccgcgcag accgagaaccccggccgctt cggccttctc gacctggccg acgacgcctc 15060 gtcgtaccgg accctgccgtcggtgctctc cgacgcgggc ctgcgcgacg aaccgcagct 15120 cgccctgcac gacggcaccatcaggctggc ccgcctggcc tccgtccggc ccgagaccgg 15180 caccgccgca ccggcgctcgccccggaggg cacggtcctg ctgaccggcg gcaccggcgg 15240 cctgggcgga ctggtcgcccggcacgtggt gggcgagtgg ggcgtacgac gcctgctgct 15300 ggtgagccgg cggggcacggacgccccggg cgccgacgag ctcgtgcacg agctggaggc 15360 cctgggagcc gacgtctcggtggccgcgtg cgacgtcgcc gaccgcgaag ccctcaccgc 15420 cgtactcgac gccatccccgccgaacaccc gctcaccgcg gtcgtccaca cggcaggcgt 15480 cctctccgac ggcaccctcccgtccatgac gacggaggac gtggaacacg tactgcggcc 15540 caaggtcgac gccgcgttcctcctcgacga actcacctcg acgcccgcat acgacctggc 15600 agcgttcgtc atgttctcctccgccgccgc cgtcttcggt ggcgcggggc agggcgccta 15660 cgccgccgcc aacgccaccctcgacgccct cgcctggcgc cgccgggcag ccggactccc 15720 cgccctctcc ctcggctggggcctctgggc cgagaccagc ggcatgaccg gcgagctcgg 15780 ccaggcggac ctgcgccggatgagccgcgc gggcatcggc gggatcagcg acgccgaggg 15840 catcgcgctc 
ctcgacgccgccctccgcga cgaccgccac ccggtcctgc tgcccctgcg 15900 gctcgacgcc gccgggctgcgggacgcggc cgggaacgac ccggccggaa tcccggcgct 15960 cttccgggac gtcgtcggcgccaggaccgt ccgggcccgg ccgtccgcgg cctccgcctc 16020 gacgacagcc gggacggccggcacgccggg gacggcggac ggcgcggcgg aaacggcggc 16080 ggtcacgctc gccgaccgggccgccaccgt ggacgggccc gcacggcagc gcctgctgct 16140 cgagttcgtc gtcggcgaggtcgccgaagt actcggccac gcccgcggtc accggatcga 16200 cgccgaacgg ggcttcctcgacctcggctt cgactccctg accgccgtcg aactccgcaa 16260 ccggctcaac tccgccggtggcctcgccct cccggcgacc ctggtcttcg accacccaag 16320 cccggcggca ctcgcctcccacctggacgc cgagctgccg cgcggcgcct cggaccagga 16380 cggagccggg aaccggaacgggaacgagaa cgggacgacg gcgtcccgga gcaccgccga 16440 gacggacgcg ctgctggcacaactgacccg cctggaaggc gccttggtgc tgacgggcct 16500 ctcggacgcc cccgggagcgaagaagtcct ggagcacctg cggtccctgc gctcgatggt 16560 cacgggcgag accgggaccgggaccgcgtc cggagccccg gacggcgccg ggtccggcgc 16620 cgaggaccgg ccctgggcggccggggacgg agccgggggc gggagtgagg acggcgcggg 16680 agtgccggac ttcatgaacgcctcggccga ggaactcttc ggcctcctcg accaggaccc 16740 cagcacggac tgatccctgccgcacggtcg cctcccgccc cggaccccgt cccgggcacc 16800 tcgactcgaa tcacttcatgcgcgcctcgg gcgcctccag gaactcaagg ggacagcgtg 16860 tccacggtga acgaagagaagtacctcgac tacctgcgtc gtgccacggc ggacctccac 16920 gaggcccgtg gccgcctccgcgagctggag gcgaaggcgg gcgagccggt ggcgatcgtc 16980 ggcatggcct gccgcctgcccggcggcgtc gcctcgcccg aggacctgtg gcggctggtg 17040 gccggcggcg aggacgcgatctcggagttc ccccaggacc gcggctggga cgtggagggc 17100 ctgtacgacc cgaacccggaggccacgggc aagagttacg cccgcgaggc cggattcctg 17160 tacgaggcgg gcgagttcgacgccgacttc ttcgggatct cgccgcgcga ggccctcgcc 17220 atggacccgc agcagcgtctcctcctggag gcctcctggg aggcgttcga gcacgccggg 17280 atcccggcgg ccaccgcgcgcggcacctcg gtcggcgtct tcaccggcgt gatgtaccac 17340 gactacgcca cccgtctcaccgatgtcccg gagggcatcg agggctacct gggcaccggc 17400 aactccggca gtgtcgcctcgggccgcgtc gcgtacacgc ttggcctgga ggggccggcc 17460 gtcacggtcg acaccgcctgctcgtcctcg ctggtcgccc tgcacctcgc cgtgcaggcc 17520 ctgcgcaagg 
gcgaggtcgacatggcgctc gccggcggcg tgacggtcat gtcgacgccc 17580 agcaccttcg tcgagttcagccgtcagcgc gggctggcgc cggacggccg gtcgaagtcc 17640 ttctcgtcga cggccgacggcaccagctgg tccgagggcg tcggcgtcct cctcgtcgag 17700 cgcctgtccg acgcgcgtcgcaagggccat cggatcctcg ccgtggtccg gggcaccgcc 17760 gtcaaccagg acggcgccagcagcggcctc acggctccga acgggccgtc gcagcagcgc 17820 gtcatccgac gtgccctggcggacgcccgg ctcacgacct ccgacgtgga cgtcgtcgag 17880 gcccacggca cgggtacgcgactcggcgac ccgatcgagg cgcaggccgt catcgccacg 17940 tacgggcagg gccgtgacggcgaacagccg ctgcgcctcg ggtcgttgaa gtccaacatc 18000 ggacacaccc aggccgccgccggtgtctcc ggcgtgatca agatggtcca ggcgatgcgc 18060 cacggcgtcc tgccgaagacgctccacgtg gagaagccga cggaccaggt ggactggtcc 18120 gcgggcgcgg tcgagctgctcaccgaggcc atggactggc cggacaaggg cgacggcgga 18180 ctgcgcaggg ccgcggtctcctccttcggc gtcagcggga cgaacgcgca cgtcgtgctc 18240 gaagaggccc cggcggccgaggagacccct gcctccgagg cgaccccggc cgtcgagccg 18300 tcggtcggcg ccggcctggtgccgtggctg gtgtcggcga agactccggc cgcgctggac 18360 gcccagatcg gacgcctcgccgcgttcgcc tcgcagggcc gtacggacgc cgccgatccg 18420 ggcgcggtcg ctcgcgtactggccggcggg cgcgccgagt tcgagcaccg ggccgtcgtg 18480 ctcggcaccg gacaggacgatttcgcgcag gcgctgaccg ctccggaagg actgatacgc 18540 ggcacgccct cggacgtgggccgggtggcg ttcgtgttcc ccggtcaggg cacgcagtgg 18600 gccgggatgg gcgccgaactcctcgacgtg tcgaaggagt tcgcggcggc catggccgag 18660 tgcgagagcg cgctctcccgctatgtcgac tggtcgctgg aggccgtcgt ccggcaggcg 18720 ccgggcgcgc ccacgctggagcgggtcgac gtcgtccagc ccgtgacctt cgctgtcatg 18780 gtttcgctgg cgaaggtctggcagcaccac ggcgtgacgc cgcaggccgt cgtcggccac 18840 tcgcagggcg agatcgccgccgcgtacgtc gccggtgccc tcaccctcga cgacgccgcc 18900 cgcgtcgtca ccctgcgcagcaagtccatc gccgcccacc tcgccggcaa gggcggcatg 18960 atctccctcg ccctcagcgaggaagccacc cggcagcgca tcgagaacct ccacggactg 19020 tcgatcgccg ccgtcaacggccccaccgcc accgtggttt cgggcgaccc cacccagatc 19080 caagagctcg ctcaggcgtgtgaggccgac ggggtccgcg cacggatcat ccccgtcgac 19140 tacgcctccc acagcgcccacgtcgagacc atcgagagcg aactcgccga ggtcctcgcc 19200 gggctcagcc 
cgcggacacctgaggtgccg ttcttctcga cactcgaagg cgcctggatc 19260 accgagccgg tgctcgacggcacctactgg taccgcaacc tccgccaccg cgtcggcttc 19320 gcccccgccg tcgagaccctcgccaccgac gaaggcttca cccacttcat cgaggtcagc 19380 gcccaccccg tcctcaccatgaccctcccc gagaccgtca ccggcctcgg caccctccgc 19440 cgcgaacagg gaggccaggagcgtctggtc acctcactcg ccgaagcctg gaccaacggc 19500 ctcaccatcg actgggcgcccgtcctcccc accgcaaccg gccaccaccc cgagctcccc 19560 acctacgcct tccagcgccgtcactactgg ctccacgact cccccgccgt ccagggctcc 19620 gtgcaggact cctggcgctaccgcatcgac tggaagcgcc tcgcggtcgc cgacgcgtcc 19680 gagcgcgccg ggctgtccgggcgctggctc gtcgtcgtcc ccgaggaccg ttccgccgag 19740 gccgccccgg tgctcgccgcgctgtccggc gccggcgccg accccgtaca gctggacgtg 19800 tccccgctgg gcgaccggcagcggctcgcc gcgacgctgg gcgaggccct ggcggcggcc 19860 ggtggagccg tcgacggcgtcctctcgctg ctcgcgtggg acgagagcgc gcaccccggc 19920 caccccgccc ccttcacccggggcaccggc gccaccctca ccctggtgca ggcgctggag 19980 gacgccggcg tcgccgccccgctgtggtgc gtgacccacg gcgcggtgtc cgtcggccgg 20040 gccgaccacg tcacctcccccgcccaggcc atggtgtggg gcatgggccg ggtcgccgcc 20100 ctggagcacc ccgagcggtggggcggcctg atcgacctgc cctcggacgc cgaccgggcg 20160 gccctggacc gcatgaccacggtcctcgcc ggcggtacgg gtgaggacca ggtcgcggta 20220 cgcgcctccg ggctgctcgcccgccgcctc gtccgcgcct ccctcccggc gcacggcacg 20280 gcttcgccgt ggtggcaggccgacggcacg gtgctcgtca ccggtgccga ggagcctgcg 20340 gccgccgagg ccgcacgccggctggcccgc gacggcgccg gacacctcct cctccacacc 20400 accccctccg gcagcgaaggcgccgaaggc acctccggtg ccgccgagga ctccggcctc 20460 gccgggctcg tcgccgaactcgcggacctg ggcgcgacgg ccaccgtcgt gacctgcgac 20520 ctcacggacg cggaggcggccgcccggctg ctcgccggcg tctccgacgc gcacccgctc 20580 agcgccgtcc tccacctgccgcccaccgtc gactccgagc cgctcgccgc gaccgacgcg 20640 gacgcgctcg cccgtgtcgtgaccgcgaag gccaccgccg cgctccacct ggaccgcctc 20700 ctgcgggagg ccgcggctgccggaggccgt ccgcccgtcc tggtcctctt ctcctcggtc 20760 gccgcgatct ggggcggcgccggtcagggc gcgtacgccg ccggtacggc cttcctcgac 20820 gccctcgccg gtcagcaccgggccgacggc cccaccgtga cctcggtggc ctggagcccc 20880 tgggagggca 
gccgcgtcaccgagggtgcg accggggagc ggctgcgccg cctcggcctg 20940 cgccccctcg cccccgcgacggcgctcacc gccctggaca ccgcgctcgg ccacggcgac 21000 accgccgtca cgatcgccgacgtcgactgg tcgagcttcg cccccggctt caccacggcc 21060 cggccgggca ccctcctcgccgatctgccc gaggcgcgcc gcgcgctcga cgagcagcag 21120 tcgacgacgg ccgccgacgacaccgtcctg agccgcgagc tcggtgcgct caccggcgcc 21180 gaacagcagc gccgtatgcaggagttggtc cgcgagcacc tcgccgtggt cctcaaccac 21240 ccctcccccg aggccgtcgacacggggcgg gccttccgtg acctcggatt cgactcgctg 21300 acggcggtcg agctccgcaaccgcctcaag aacgccaccg gcctggccct cccggccact 21360 ctggtcttcg actacccgaccccccggacg ctggcggagt tcctcctcgc ggagatcctg 21420 ggcgagcagg ccggtgccggcgagcagctt ccggtggacg gcggggtcga cgacgagccc 21480 gtcgcgatcg tcggcatggcgtgccgcctg ccgggcggtg tcgcctcgcc ggaggacctg 21540 tggcggctgg tggccggcggcgaggacgcg atctccggct tcccgcagga ccgcggctgg 21600 gacgtggagg ggctgtacgacccggacccg gacgcgtccg ggcggacgta ctgccgtgcc 21660 ggtggcttcc tcgacgaggcgggcgagttc gacgccgact tcttcgggat ctcgccgcgc 21720 gaggccctcg ccatggacccgcagcagcgg ctcctcctgg agacctcctg ggaggccgtc 21780 gaggacgccg ggatcgacccgacctccctt caggggcagc aggtcggcgt gttcgcgggc 21840 accaacggcc cccactacgagccgctgctc cgcaacaccg ccgaggatct tgagggttac 21900 gtcgggacgg gcaacgccgccagcatcatg tcgggccgtg tctcgtacac cctcggcctg 21960 gagggcccgg ccgtcacggtcgacaccgcc tgctcctcct cgctggtcgc cctgcacctc 22020 gccgtgcagg ccctgcgcaagggcgaatgc ggactggcgc tcgcgggcgg tgtgacggtc 22080 atgtcgacgc ccacgacgttcgtggagttc agccggcagc gcgggctcgc ggaggacggc 22140 cggtcgaagg cgttcgccgcgtcggcggac ggcttcggcc cggcggaggg cgtcggcatg 22200 ctcctcgtcg agcgcctgtcggacgcccgc cgcaacggac accgtgtgct ggcggtcgtg 22260 cgcggcagcg cggtcaaccaggacggcgcg agcaacggcc tgaccgcccc gaacgggccc 22320 tcgcagcagc gcgtcatccggcgcgcgctc gcggacgccc gactgacgac cgccgacgtg 22380 gacgtcgtcg aggcccacggcacgggcacg cgactcggcg acccgatcga ggcacaggcc 22440 ctcatcgcca cctacggccaggggcgcgac accgaacagc cgctgcgcct ggggtcgttg 22500 aagtccaaca tcggacacacccaggccgcc gccggtgtct ccggcatcat caagatggtc 22560 caggcgatgc 
gccacggcgtcctgccgaag acgctccacg tggaccggcc gtcggaccag 22620 atcgactggt cggcgggcacggtcgagctg ctcaccgagg ccatggactg gccgaggaag 22680 caggagggcg ggctgcgccgcgcggccgtc tcctccttcg gcatcagcgg cacgaacgcg 22740 cacatcgtgc tcgaagaagccccggtcgac gaggacgccc cggcggacga gccgtcggtc 22800 ggcggtgtgg tgccgtggctcgtgtccgcg aagactccgg ccgcgctgga cgcccagatc 22860 ggacgcctcg ccgcgttcgcctcgcagggc cgtacggacg ccgccgatcc gggcgcggtc 22920 gctcgcgtac tggccggcgggcgtgcgcag ttcgagcacc gggccgtcgc gctcggcacc 22980 ggacaggacg acctggcggccgcactggcc gcgcctgagg gtctggtccg gggtgtggcc 23040 tccggtgtgg gtcgagtggcgttcgtgttc ccgggacagg gcacgcagtg ggccgggatg 23100 ggtgccgaac tcctcgacgtgtcgaaggag ttcgcggcgg ccatggccga gtgcgaggcc 23160 gcgctcgctc cgtacgtggactggtcgctg gaggccgtcg tccgacaggc ccccggcgcg 23220 cccacgctgg agcgggtcgatgtcgtccag cccgtgacgt tcgccgtcat ggtctcgctg 23280 gcgaaggtct ggcagcaccacggggtgacc ccgcaagccg tcgtcggcca ctcgcagggc 23340 gagatcgccg ccgcgtacgtcgccggtgcc ctgagcctgg acgacgccgc tcgtgtcgtg 23400 accctgcgca gcaagtccatcggcgcccac ctcgcgggcc agggcggcat gctgtccctc 23460 gcgctgagcg aggcggccgttgtggagcga ctggccgggt tcgacgggct gtccgtcgcc 23520 gccgtcaacg ggcctaccgccaccgtggtt tcgggcgacc cgacccagat ccaagagctc 23580 gctcaggcgt gtgaggccgacggggtccgc gcacggatca tccccgtcga ctacgcctcc 23640 cacagcgccc acgtcgagaccatcgagagc gaactcgccg acgtcctggc ggggttgtcc 23700 ccccagacac cccaggtccccttcttctcc accctcgaag gcgcctggat caccgaaccc 23760 gccctcgacg gcggctactggtaccgcaac ctccgccatc gtgtgggctt cgccccggcc 23820 gtcgaaaccc tggccaccgacgaaggcttc acccacttcg tcgaggtcag cgcccacccc 23880 gtcctcacca tggcgctgcccgagaccgtc accggactcg gcaccctccg ccgtgacaac 23940 ggcggacagc accgcctcaccacctccctc gccgaggcct gggccaacgg cctcaccgtc 24000 gactgggcct ctctcctccccaccacgacc acccaccccg atctgcccac ctacgccttc 24060 cagaccgagc gctactggccgcagcccgac ctctccgccg ccggtgacat cacctccgcc 24120 ggtctcgggg cggccgagcacccgctgctc ggcgcggccg tggcgctcgc ggactccgac 24180 ggctgcctgc tcacggggagcctctccctc cgtacgcacc cctggctggc ggaccacgcg 24240 gtggccggca 
ccgtgctgctgccgggaacg gcgttcgtgg agctggcgtt ccgagccggg 24300 gaccaggtcg gttgcgatctggtcgaggag ctcaccctcg acgcgccgct cgtgctgccc 24360 cgtcgtggcg cggtccgtgtgcagctgtcc gtcggcgcga gcgacgagtc cgggcgtcgt 24420 accttcgggc tctacgcgcacccggaggac gcgccgggcg aggcggagtg gacgcggcac 24480 gccaccggtg tgctggccgcccgtgcggac cgcaccgccc ccgtcgccga cccggaggcc 24540 tggccgccgc cgggcgccgagccggtggac gtggacggtc tgtacgagcg cttcgcggcg 24600 aacggctacg gctacggccccctcttccag ggcgtccgtg gtgtctggcg gcgtggcgac 24660 gaggtgttcg ccgacgtggccctgccggcc gaggtcgccg gtgccgaggg cgcgcggttc 24720 ggccttcacc cggcgctgctcgacgccgcc gtgcaggcgg ccggtgcggg ccggggcgtt 24780 cggcgcgggc acgcggctgccgttcgcctg gagcgggatc tcctgtacgc ggtcggcgcc 24840 accgccctcc gcgtgcggctggcccccgcc ggcccggaca cggtgtccgt gagcgccgcc 24900 gactcctccg ggcagccggtgttcgccgcg gactccctca cggtgctgcc cgtcgacccc 24960 gcgcagctgg cggccttcagcgacccgact ctggacgcgc tgcacctgct ggagtggacc 25020 gcctgggacg gtgccgcgcaggccctgccc ggcgcggtcg tgctgggcgg cgacgccgac 25080 ggtctcgccg cggcgctgcgcgccggtggc accgaggtcc tgtccttccc ggaccttacg 25140 gacctggtgg aggccgtcgaccggggcgag accccggccc cggcgaccgt cctggtggcc 25200 tgccccgccg ccggccccgatgggccggag catgtccgcg aggccctgca cgggtcgctc 25260 gcgctgatgc aggcctggctggccgacgag cggttcaccg atgggcgcct ggtgctcgtg 25320 acccgcgacg cggtcgccgcccgttccggc gacggcctgc ggtccacggg acaggccgcc 25380 gtctggggcc tcggccggtccgcgcagacg gagagcccgg gccggttcgt cctgctcgac 25440 ctcgccgggg aagcccggacggccggggac gccaccgccg gggacggcct gacgaccggg 25500 gacgccaccg tcggcggcacctctggagac gccgccctcg gcagcgccct cgcgaccgcc 25560 ctcggctcgg gcgagccgcagctcgccctc cgggacgggg cgctcctcgt accccgcctg 25620 gcgcgggccg ccgcgcccgccgcggccgac ggcctcgccg cggccgacgg cctcgccgct 25680 ctgccgctgc ccgccgctccggccctctgg cgtctggagc ccggtacgga cggcagcctg 25740 gagagcctca cggcggcgcccggcgacgcc gagaccctcg ccccggagcc gctcggcccg 25800 ggacaggtcc gcatcgcgatccgggccacc ggtctcaact tccgcgacgt cctgatcgcc 25860 ctcggcatgt accccgatccggcgctgatg ggcaccgagg gagccggcgt ggtcaccgcg 25920 accggccccg 
gcgtcacgcacctcgccccc ggcgaccggg tcatgggcct gctctccggc 25980 gcgtacgccc cggtcgtcgtggcggacgcg cggaccgtcg cgcggatgcc cgaggggtgg 26040 acgttcgccc agggcgcctccgtgccggtg gtgttcctga cggccgtcta cgccctgcgc 26100 gacctggcgg acgtcaagcccggcgagcgc ctcctggtcc actccgccgc cggtggcgtg 26160 ggcatggccg ccgtgcagctcgcccggcac tggggcgtgg aggtccacgg cacggcgagt 26220 cacgggaagt gggacgccctgcgcgcgctc ggcctggacg acgcgcacat cgcctcctcc 26280 cgcaccctgg acttcgagtccgcgttccgt gccgcttccg gcggggcggg catggacgtc 26340 gtactgaact cgctcgcccgcgagttcgtc gacgcctcgc tgcgcctgct cgggccgggc 26400 ggccggttcg tggagatggggaagaccgac gtccgcgacg cggagcgggt cgccgccgac 26460 caccccggtg tcggctaccgcgccttcgac ctgggcgagg ccgggccgga gcggatcggc 26520 gagatgctcg ccgaggtcatcgccctcttc gaggacgggg tgctccggca cctgcccgtc 26580 acgacctggg acgtgcgccgggcccgcgac gccttccggc acgtcagcca ggcccgccac 26640 acgggcaagg tcgtcctcacgatgccgtcg ggcctcgacc cggagggtac ggtcctgctg 26700 accggcggca ccggtgcgctggggggcatc gtggcccggc acgtggtggg cgagtggggc 26760 gtacgacgcc tgctgctcgtgagccggcgg ggcacggacg ccccgggcgc cggcgagctc 26820 gtgcacgagc tggaggccctgggagccgac gtctcggtgg ccgcgtgcga cgtcgccgac 26880 cgcgaagccc tcaccgccgtactcgactcg atccccgccg aacacccgct caccgcggtc 26940 gtccacacgg caggcgtcctctccgacggc accctcccct cgatgacagc ggaggatgtg 27000 gaacacgtac tgcgtcccaaggtcgacgcc gcgttcctcc tcgacgaact cacctcgacg 27060 cccggctacg acctggcagcgttcgtcatg ttctcctccg ccgccgccgt cttcggtggc 27120 gcggggcagg gcgcctacgccgccgccaac gccaccctcg acgccctcgc ctggcgccgc 27180 cggacagccg gactccccgccctctccctc ggctggggcc tctgggccga gaccagcggc 27240 atgaccggcg gactcagcgacaccgaccgc tcgcggctgg cccgttccgg ggcgacgccc 27300 atggacagcg agctgaccctgtccctcctg gacgcggcca tgcgccgcga cgacccggcg 27360 ctcgtcccga tcgccctggacgtcgccgcg ctccgcgccc agcagcgcga cggcatgctg 27420 gcgccgctgc tcagcgggctcacccgcgga tcgcgggtcg gcggcgcgcc ggtcaaccag 27480 cgcagggcag ccgccggaggcgcgggcgag gcggacacgg acctcggcgg gcggctcgcc 27540 gcgatgacac cggacgaccgggtcgcgcac ctgcgggacc tcgtccgtac gcacgtggcg 27600 accgtcctgg 
gacacggcaccccgagccgg gtggacctgg agcgggcctt ccgcgacacc 27660 ggtttcgact cgctcaccgccgtcgaactc cgcaaccgtc tcaacgccgc gaccgggctg 27720 cggctgccgg ccacgctggtcttcgaccac cccaccccgg gggagctcgc cgggcacctg 27780 ctcgacgaac tcgccacggccgcgggcggg tcctgggcgg aaggcaccgg gtccggagac 27840 acggcctcgg cgaccgatcggcagaccacg gcggccctcg ccgaactcga ccggctggaa 27900 ggcgtgctcg cctccctcgcgcccgccgcc ggcggccgtc cggagctcgc cgcccggctc 27960 agggcgctgg ccgcggccctgggggacgac ggcgacgacg ccaccgacct ggacgaggcg 28020 tccgacgacg acctcttctccttcatcgac aaggagctgg gcgactccga cttctgacct 28080 gcccgacacc accggcaccaccggcaccac cagcccccct cacacacgga acacggaacg 28140 gacaggcgag aacgggagccatggcgaaca acgaagacaa gctccgcgac tacctcaagc 28200 gcgtcaccgc cgagctgcagcagaacacca ggcgtctgcg cgagatcgag ggacgcacgc 28260 acgagccggt ggcgatcgtgggcatggcct gccgcctgcc gggcggtgtc gcctcgcccg 28320 aggacctgtg gcagctggtggccggggacg gggacgcgat ctcggagttc ccgcaggacc 28380 gcggctggga cgtggaggggctgtacgacc ccgacccgga cgcgtccggc aggacgtact 28440 gccggtccgg cggattcctgcacgacgccg gcgagttcga cgccgacttc ttcgggatct 28500 cgccgcgcga ggccctcgccatggacccgc agcagcgact gtccctcacc accgcgtggg 28560 aggcgatcga gagcgcgggcatcgacccga cggccctgaa gggcagcggc ctcggcgtct 28620 tcgtcggcgg ctggcacaccggctacacct cggggcagac caccgccgtg cagtcgcccg 28680 agctggaggg ccacctggtcagcggcgcgg cgctgggctt cctgtccggc cgtatcgcgt 28740 acgtcctcgg tacggacggaccggccctga ccgtggacac ggcctgctcg tcctcgctgg 28800 tcgccctgca cctcgccgtgcaggccctcc gcaagggcga gtgcgacatg gccctcgccg 28860 gtggtgtcac ggtcatgcccaacgcggacc tgttcgtgca gttcagccgg cagcgcgggc 28920 tggccgcgga cggccggtcgaaggcgttcg ccacctcggc ggacggcttc ggccccgcgg 28980 agggcgccgg agtcctgctggtggagcgcc tgtcggacgc ccgccgcaac ggacaccgga 29040 tcctcgcggt cgtccgcggcagcgcggtca accaggacgg cgccagcaac ggcctcacgg 29100 ctccgcacgg gccctcccagcagcgcgtca tccgacgggc cctggcggac gcccggctcg 29160 cgccgggtga cgtggacgtcgtcgaggcgc acggcacggg cacgcggctc ggcgacccga 29220 tcgaggcgca ggccctcatcgccacctacg gccaggagaa gagcagcgaa cagccgctga 29280 ggctgggcgc 
gttgaagtcgaacatcgggc acacgcaggc cgcggccggt gtcgcaggtg 29340 tcatcaagat ggtccaggcgatgcgccacg gactgctgcc gaagacgctg cacgtcgacg 29400 agccctcgga ccagatcgactggtcggcgg gcacggtgga actcctcacc gaggccgtcg 29460 actggccgga gaagcaggacggcgggctgc gccgcgcggc tgtctcctcc ttcggcatca 29520 gcgggacgaa cgcgcacgtcgtcctggagg aggccccggc ggtcgaggac tccccggccg 29580 tcgagccgcc ggccggtggcggtgtggtgc cgtggccggt gtccgcgaag actccggccg 29640 cgctggacgc ccagatcgggcagctcgccg cgtacgcgga cggtcgtacg gacgtggatc 29700 cggcggtggc cgcccgcgccctggtcgaca gccgtacggc gatggagcac cgcgcggtcg 29760 cggtcggcga cagccgggaggcactgcggg acgccctgcg gatgccggaa ggactggtac 29820 gcggcacgtc ctcggacgtgggccgggtgg cgttcgtctt ccccggccag ggcacgcagt 29880 gggccggcat gggcgccgaactccttgaca gctcaccgga gttcgctgcc tcgatggccg 29940 aatgcgagac cgcgctctcccgctacgtcg actggtctct tgaagccgtc gtccgacagg 30000 aacccggcgc acccacgctcgaccgcgtcg acgtcgtcca gcccgtgacc ttcgctgtca 30060 tggtctcgct ggcgaaggtctggcagcacc acggcatcac cccccaggcc gtcgtcggcc 30120 actcgcaggg cgagatcgccgccgcgtacg tcgccggtgc actcaccctc gacgacgccg 30180 cccgcgtcgt caccctgcgcagcaagtcca tcgccgccca cctcgccggc aagggcggca 30240 tgatctccct cgccctcgacgaggcggccg tcctgaagcg actgagcgac ttcgacggac 30300 tctccgtcgc cgccgtcaacggccccaccg ccaccgtcgt ctccggcgac ccgacccaga 30360 tcgaggaact cgcccgcacctgcgaggccg acggcgtccg tgcgcggatc atcccggtcg 30420 actacgcctc ccacagccggcaggtcgaga tcatcgagaa ggagctggcc gaggtcctcg 30480 ccggactcgc cccgcaggctccgcacgtgc cgttcttctc caccctcgaa ggcacctgga 30540 tcaccgagcc ggtgctcgacggcacctact ggtaccgcaa cctgcgccat cgcgtgggct 30600 tcgcccccgc cgtggagaccttggcggttg acggcttcac ccacttcatc gaggtcagcg 30660 cccaccccgt cctcaccatgaccctccccg agaccgtcac cggcctcggc accctccgcc 30720 gcgaacaggg aggccaggagcgtctggtca cctcactcgc cgaagcctgg gccaacggcc 30780 tcaccatcga ctgggcgcccatcctcccca ccgcaaccgg ccaccacccc gagctcccca 30840 cctacgcctt ccagaccgagcgcttctggc tgcagagctc cgcgcccacc agcgccgccg 30900 acgactggcg ttaccgcgtcgagtggaagc cgctgacggc ctccggccag gcggacctgt 30960 ccgggcggtg 
gatcgtcgccgtcgggagcg agccagaagc cgagctgctg ggcgcgctga 31020 aggccgcggg agcggaggtcgacgtactgg aagccggggc ggacgacgac cgtgaggccc 31080 tcgccgcccg gctcaccgcactgacgaccg gcgacggctt caccggcgtg gtctcgctcc 31140 tcgacgacct cgtgccacaggtcgcctggg tgcaggcact cggcgacgcc ggaatcaagg 31200 cgcccctgtg gtccgtcacccagggcgcgg tctccgtcgg acgtctcgac acccccgccg 31260 accccgaccg ggccatgctctggggcctcg gccgcgtcgt cgcccttgag caccccgaac 31320 gctgggccgg cctcgtcgacctccccgccc agcccgatgc cgccgccctc gcccacctcg 31380 tcaccgcact ctccggcgccaccggcgagg accagatcgc catccgcacc accggactcc 31440 acgcccgccg cctcgcccgcgcacccctcc acggacgtcg gcccacccgc gactggcagc 31500 cccacggcac cgtcctcatcaccggcggca ccggagccct cggcagccac gccgcacgct 31560 ggatggccca ccacggagccgaacacctcc tcctcgtcag ccgcagcggc gaacaagccc 31620 ccggagccac ccaactcaccgccgaactca ccgcatcggg cgcccgcgtc accatcgccg 31680 cctgcgacgt cgccgacccccacgccatgc gcaccctcct cgacgccatc cccgccgaga 31740 cgcccctcac cgccgtcgtccacaccgccg gcgcaccggg cggcgatccg ctggacgtca 31800 ccggcccgga ggacatcgcccgcatcctgg gcgcgaagac gagcggcgcc gaggtcctcg 31860 acgacctgct ccgcggcactccgctggacg ccttcgtcct ctactcctcg aacgccgggg 31920 tctggggcag cggcagccagggcgtctacg cggcggccaa cgcccacctc gacgcgctcg 31980 ccgcccggcg ccgcgcccggggcgagacgg cgacctcggt cgcctggggc ctctgggccg 32040 gcgacggcat gggccggggcgccgacgacg cgtactggca gcgtcgcggc atccgtccga 32100 tgagccccga ccgcgccctggacgaactgg ccaaggccct gagccacgac gagaccttcg 32160 tcgccgtggc cgatgtcgactgggagcggt tcgcgcccgc gttcacggtg tcccgtccca 32220 gccttctgct cgacggcgtcccggaggccc ggcaggcgct cgccgcaccc gtcggtgccc 32280 cggctcccgg cgacgccgccgtggcgccga ccgggcagtc gtcggcgctg gccgcgatca 32340 ccgcgctccc cgagcccgagcgccggccgg cgctcctcac cctcgtccgt acccacgcgg 32400 cggccgtact cggccattcctcccccgacc gggtggcccc cggccgtgcc ttcaccgagc 32460 tcggcttcga ctcgctgacggccgtgcagc tccgcaacca gctctccacg gtggtcggca 32520 acaggctccc cgccaccacggtcttcgacc acccgacgcc cgccgcactc gccgcgcacc 32580 tccacgaggc gtacctcgcaccggccgagc cggccccgac ggactgggag gggcgggtgc 32640 gccgggccct 
ggccgaactgcccctcgacc ggctgcggga cgcgggggtc ctcgacaccg 32700 tcctgcgcct caccggcatcgagcccgagc cgggttccgg cggttcggac ggcggcgccg 32760 ccgaccctgg tgcggagccggaggcgtcga tcgacgacct ggacgccgag gccctgatcc 32820 ggatggctct cggcccccgtaacacctgac ccgaccgcgg tcctgcccca cgcgccgcac 32880 cccgcgcatc ccgcgcaccacccgccccca cacgcccaca accccatcca cgagcggaag 32940 accacaccca gatgacgagttccaacgaac agttggtgga cgctctgcgc gcctctctca 33000 aggagaacga agaactccggaaagagagcc gtcgccgggc cgaccgtcgg caggagccca 33060 tggcgatcgt cggcatgagctgccggttcg cgggcggaat ccggtccccc gaggacctct 33120 gggacgccgt cgccgcgggcaaggacctgg tctccgaggt accggaggag cgcggctggg 33180 acatcgactc cctctacgacccggtgcccg ggcgcaaggg cacgacgtac gtccgcaacg 33240 ccgcgttcct cgacgacgccgccggattcg acgcggcctt cttcgggatc tcgccgcgcg 33300 aggccctcgc catggacccgcagcagcggc agctcctcga agcctcctgg gaggtcttcg 33360 agcgggccgg catcgaccccgcgtcggtcc gcggcaccga cgtcggcgtg tacgtgggct 33420 gtggctacca ggactacgcgccggacatcc gggtcgcccc cgaaggcacc ggcggttacg 33480 tcgtcaccgg caactcctccgccgtggcct ccgggcgcat cgcgtactcc ctcggcctgg 33540 agggacccgc cgtgaccgtggacacggcgt gctcctcttc gctcgtcgcc ctgcacctcg 33600 ccctgaaggg cctgcggaacggcgactgct cgacggcact cgtgggcggc gtggccgtcc 33660 tcgcgacgcc gggcgcgttcatcgagttca gcagccagca ggccatggcc gccgacggcc 33720 ggaccaaggg cttcgcctcggcggcggacg gcctcgcctg gggcgagggc gtcgccgtac 33780 tcctcctcga acggctctccgacgcgcggc gcaagggcca ccgggtcctg gccgtcgtgc 33840 gcggcagcgc catcaaccaggacggcgcga gcaacggcct cacggctccg cacgggccct 33900 cccagcagca cctgatccgccaggccctgg ccgacgcgcg gctcacgtcg agcgacgtgg 33960 acgtcgtgga gggccacggcacggggaccc gtctcggcga cccgatcgag gcgcaggcgc 34020 tgctcgccac gtacgggcaggggcgcgccc cggggcagcc gctgcggctg gggacgctga 34080 agtcgaacat cgggcacacgcaggccgctt cgggtgtcgc cggtgtcatc aagatggtgc 34140 aggcgctgcg ccacggggtgctgccgaaga ccctgcacgt ggacgagccg acggaccagg 34200 tcgactggtc ggccggttcggtcgagctgc tcaccgaggc cgtggactgg ccggagcggc 34260 cgggccggct ccgccgggcgggcgtctccg cgttcggcgt gggcgggacg aacgcgcacg 34320 tcgtcctgga 
ggaggccccggcggtcgagg agtcccctgc cgtcgagccg ccggccggtg 34380 gcggcgtggt gccgtggccggtgtccgcga agacctcggc cgcactggac gcccagatcg 34440 ggcagctcgc cgcatacgcggaagaccgca cggacgtgga tccggcggtg gccgcccgcg 34500 ccctggtcga cagccgtacggcgatggagc accgcgcggt cgcggtcggc gacagccggg 34560 aggcactgcg ggacgccctgcggatgccgg aaggactggt acggggcacg gtcaccgatc 34620 cgggccgggt ggcgttcgtcttccccggcc agggcacgca gtgggccggc atgggcgccg 34680 aactcctcga cagctcacccgaattcgccg ccgccatggc cgaatgcgag accgcactct 34740 ccccgtacgt cgactggtctctcgaagccg tcgtccgaca ggctcccagc gcaccgacac 34800 tcgaccgcgt cgacgtcgtccagcccgtca ccttcgccgt catggtctcc ctcgccaagg 34860 tctggcagca ccacggcatcacccccgagg ccgtcatcgg ccactcccag ggcgagatcg 34920 ccgccgcgta cgtcgccggtgccctcaccc tcgacgacgc cgctcgtgtc gtgaccctcc 34980 gcagcaagtc catcgccgcccacctcgccg gcaagggcgg catgatctcc ctcgccctca 35040 gcgaggaagc cacccggcagcgcatcgaga acctccacgg actgtcgatc gccgccgtca 35100 acgggcctac cgccaccgtggtttcgggcg accccaccca gatccaagaa cttgctcagg 35160 cgtgtgaggc cgacggcatccgcgcacgga tcatccccgt cgactacgcc tcccacagcg 35220 cccacgtcga gaccatcgagaacgaactcg ccgacgtcct ggcggggttg tccccccaga 35280 caccccaggt ccccttcttctccaccctcg aaggcacctg gatcaccgaa cccgccctcg 35340 acggcggcta ctggtaccgcaacctccgcc atcgtgtggg cttcgccccg gccgtcgaga 35400 ccctcgccac cgacgaaggcttcacccact tcatcgaggt cagcgcccac cccgtcctca 35460 ccatgaccct ccccgacaaggtcaccggcc tggccaccct ccgacgcgag gacggcggac 35520 agcaccgcct caccacctcccttgccgagg cctgggccaa cggcctcgcc ctcgactggg 35580 cctccctcct gcccgccacgggcgccctca gccccgccgt ccccgacctc ccgacgtacg 35640 ccttccagca ccgctcgtactggatcagcc ccgcgggtcc cggcgaggcg cccgcgcaca 35700 ccgcttccgg gcgcgaggccgtcgccgaga cggggctcgc gtggggcccg ggtgccgagg 35760 acctcgacga ggagggccggcgcagcgccg tactcgcgat ggtgatgcgg caggcggcct 35820 ccgtgctccg gtgcgactcgcccgaagagg tccccgtcga ccgcccgctg cgggagatcg 35880 gcttcgactc gctgaccgccgtcgacttcc gcaaccgcgt caaccggctg accggtctcc 35940 agctgccgcc caccgtcgtgttccagcacc cgacgcccgt cgcgctcgcc gagcgcatca 36000 gcgacgagct 
ggccgagcggaactgggccg tcgccgagcc gtcggatcac gagcaggcgg 36060 aggaggagaa ggccgccgctccggcggggg cccgctccgg ggccgacacc ggcgccggcg 36120 ccgggatgtt ccgcgccctgttccggcagg ccgtggagga cgaccggtac ggcgagttcc 36180 tcgacgtcct cgccgaagcctccgcgttcc gcccgcagtt cgcctcgccc gaggcctgct 36240 cggagcggct cgacccggtgctgctcgccg gcggtccgac ggaccgggcg gaaggccgtg 36300 ccgttctcgt cggctgcaccggcaccgcgg cgaacggcgg cccgcacgag ttcctgcggc 36360 tcagcacctc cttccaggaggagcgggact tcctcgccgt acctctcccc ggctacggca 36420 cgggtacggg caccggcacggccctcctcc cggccgatct cgacaccgcg ctcgacgccc 36480 aggcccgggc gatcctccgggccgccgggg acgccccggt cgtcctgctc gggcactccg 36540 gcggcgccct gctcgcgcacgagctggcct tccgcctgga gcgggcgcac ggcgcgccgc 36600 cggccgggat cgtcctggtcgacccctatc cgccgggcca tcaggagccc atcgaggtgt 36660 ggagcaggca gctgggcgagggcctgttcg cgggcgagct ggagccgatg tccgatgcgc 36720 ggctgctggc catgggccggtacgcgcggt tcctcgccgg cccgcggccg ggccgcagca 36780 gcgcgcccgt gcttctggtccgtgcctccg aaccgctggg cgactggcag gaggagcggg 36840 gcgactggcg tgcccactgggaccttccgc acaccgtcgc ggacgtgccg ggcgaccact 36900 tcacgatgat gcgggaccacgcgccggccg tcgccgaggc cgtcctctcc tggctcgacg 36960 ccatcgaggg catcgagggggcgggcaagt gaccgacaga cctctgaacg tggacagcgg 37020 actgtggatc cggcgcttccaccccgcgcc gaacagcgcg gtgcggctgg tctgcctgcc 37080 gcacgccggc ggctccgccagctacttctt ccgcttctcg gaggagctgc acccctccgt 37140 cgaggccctg tcggtgcagtatccgggccg ccaggaccgg cgtgccgagc cgtgtctgga 37200 gagcgtcgag gagctcgccgagcatgtggt cgcggccacc gaaccctggt ggcaggaggg 37260 ccggctggcc ttcttcgggcacagcctcgg cgcctccgtc gccttcgaga cggcccgcat 37320 cctggaacag cggcacggggtacggcccga gggcctgtac gtctccggtc ggcgcgcccc 37380 gtcgctggcg ccggaccggctcgtccacca gctggacgac cgggcgttcc tggccgagat 37440 ccggcggctc agcggcaccgacgagcggtt cctccaggac gacgagctgc tgcggctggt 37500 gctgcccgcg ctgcgcagcgactacaaggc ggcggagacg tacctgcacc ggccgtccgc 37560 caagctcacc tgcccggtgatggccctggc cggcgaccgt gacccgaagg cgccgctgaa 37620 cgaggtggcc gagtggcgtcggcacaccag cgggccgttc tgcctccggg cgtactccgg 37680 cggccacttc 
tacctcaacg accagtggca cgagatctgc aacgacatct ccgaccacct 37740 gctcgtcacc cgcggcgcgc ccgatgcccg cgtcgtgcag cccccgacca gccttatcga 37800 aggagcggcg aagagatggc agaacccacg gtgaccgacg acctgacggg ggccctcacg 37860 cagcccccgc tgggccgcac cgtccgcgcg gtggccgacc gtgaactcgg cacccacctc 37920 ctggagaccc gcggcatcca ctggatcc 37948
6 12199 PRT Streptomyces venezuelae 6
Met Ala Phe Ser Pro Gln Gly Gly Arg His Glu Leu Gly Gln Asn Phe 1 5 10 15 Leu Val Asp Arg Ser Val Ile Asp Glu Ile Asp Gly Leu Val Ala Arg 20 25 30 Thr Lys Gly Pro Ile Leu Glu Ile Gly Pro Gly Asp Gly Ala Leu Thr 35 40 45 Leu Pro Leu Ser Arg His Gly Arg Pro Ile Thr Ala Val Glu Leu Asp 50 55 60 Gly Arg Arg Ala Gln Arg Leu Gly Ala Arg Thr Pro Gly His Val Thr 65 70 75 80 Val Val His His Asp Phe Leu Gln Tyr Pro Leu Pro Arg Asn Pro His 85 90 95 Val Val Val Gly Asn Val Pro Phe His Leu Thr Thr Ala Ile Met Arg 100 105 110 Arg Leu Leu Asp Ala Gln His Trp His Thr Ala Val Leu Leu Val Gln 115 120 125 Trp Glu Val Ala Arg Arg Arg Ala Gly Val Gly Gly Ser Thr Leu Leu 130 135 140 Thr Ala Gly Trp Ala Pro Trp Tyr Glu Phe Asp Leu His Ser Arg Val 145 150 155 160 Pro Ala Arg Ala Phe Arg Pro Met Pro Gly Val Asp Gly Gly Val Leu 165 170 175 Ala Ile Arg Arg Arg Ser Ala Pro Leu Val Gly Gln Val Lys Thr Tyr 180 185 190 Gln Asp Phe Val Arg Gln Val Phe Thr Gly Lys Gly Asn Gly Leu Lys 195 200 205 Glu Ile Leu Arg Arg Thr Gly Arg Ile Ser Gln Arg Asp Leu Ala Thr 210 215 220 Trp Leu Arg Arg Asn Glu Ile Ser Pro His Ala Leu Pro Lys Asp Leu 225 230 235 240 Lys Pro Gly Gln Trp Ala Ser Leu Trp Glu Leu Thr Gly Gly Thr Ala 245 250 255 Asp Gly Ser Phe Asp Gly Thr Ala Gly Gly Gly Ala Ala Gly Ser His 260 265 270 Gly Ala Ala Arg Val Gly Ala Gly His Pro Gly Gly Arg Val Ser Ala 275 280 285 Ser Arg Arg Gly Val Pro Gln Ala Arg Arg Gly Arg Gly His Ala Val 290 295 300 Arg Ser Ser Thr Gly Thr Glu Pro Arg Trp Gly Arg Gly Arg Ala Glu 305 310 315 320 Ser Ala Met Ala Met Arg Asp Ser Ile Pro Arg Arg Ala Asp Arg Asp 325 330 335 Thr Leu Arg Arg Glu Leu Gly Gln Asn Phe Leu Gln Asp Asp Arg Ala 340 345 350 Val
ArgAsn Leu Val Thr His Val Glu Gly Asp Gly Arg Asn Val Leu 355 360 365 GluIle Gly Pro Gly Lys Gly Ala Ile Thr Glu Glu Leu Val Arg Ser 370 375 380Phe Asp Thr Val Thr Val Val Glu Met Asp Pro His Trp Ala Ala His 385 390395 400 Val Arg Arg Lys Phe Glu Gly Glu Arg Val Thr Val Phe Gln Gly Asp405 410 415 Phe Leu Asp Phe Arg Ile Pro Arg Asp Ile Asp Thr Val Val GlyAsn 420 425 430 Val Pro Phe Gly Ile Thr Thr Gln Ile Leu Arg Ser Leu LeuGlu Ser 435 440 445 Thr Asn Trp Gln Ser Ala Ala Leu Ile Val Gln Trp GluVal Ala Arg 450 455 460 Lys Arg Ala Gly Arg Ser Gly Gly Ser Leu Leu ThrThr Ser Trp Ala 465 470 475 480 Pro Trp Tyr Glu Phe Ala Val His Asp ArgVal Arg Ala Ser Ser Phe 485 490 495 Arg Pro Met Pro Arg Val Asp Gly GlyVal Leu Thr Ile Arg Arg Arg 500 505 510 Pro Gln Pro Leu Leu Pro Glu SerAla Ser Arg Ala Phe Gln Asn Phe 515 520 525 Ala Glu Ala Val Phe Thr GlyPro Gly Arg Gly Leu Ala Glu Ile Leu 530 535 540 Arg Arg His Ile Pro LysArg Thr Tyr Arg Ser Leu Ala Asp Arg His 545 550 555 560 Gly Ile Pro AspGly Gly Leu Pro Lys Asp Leu Thr Leu Thr Gln Trp 565 570 575 Ile Ala LeuPhe Gln Ala Ser Gln Pro Ser Tyr Ala Pro Gly Ala Pro 580 585 590 Gly ThrArg Met Pro Gly Gln Gly Gly Gly Ala Gly Gly Arg Asp Tyr 595 600 605 AspSer Glu Thr Ser Arg Ala Ala Val Pro Gly Ser Arg Arg Tyr Gly 610 615 620Pro Thr Arg Gly Gly Glu Pro Cys Ala Pro Arg Ala Gln Val Arg Gln 625 630635 640 Thr Lys Gly Arg Gln Gly Ala Arg Gly Ser Ser Tyr Gly Arg Arg Thr645 650 655 Gly Arg Met Ser Ser Ala Gly Ile Thr Arg Thr Gly Ala Arg ThrPro 660 665 670 Val Thr Gly Arg Gly Ala Ala Ala Trp Asp Thr Gly Glu ValArg Val 675 680 685 Arg Arg Gly Leu Pro Pro Ala Gly Pro Asp His Ala GluHis Ser Phe 690 695 700 Ser Arg Ala Pro Thr Gly Asp Val Arg Ala Glu LeuIle Arg Gly Glu 705 710 715 720 Met Ser Thr Val Ser Lys Ser Glu Ser GluGlu Phe Val Ser Val Ser 725 730 735 Asn Asp Ala Gly Ser Ala His Gly ThrAla Glu Pro Val Ala Val Val 740 745 750 Gly Ile Ser Cys Arg Val Pro GlyAla Arg Asp Pro Arg Glu Phe Trp 755 760 765 Glu Leu Leu Ala Ala Gly GlyGln Ala 
Val Thr Asp Val Pro Ala Asp 770 775 780 Arg Trp Asn Ala Gly AspPhe Tyr Asp Pro Asp Arg Ser Ala Pro Gly 785 790 795 800 Arg Ser Asn SerArg Trp Gly Gly Phe Ile Glu Asp Val Asp Arg Phe 805 810 815 Asp Ala AlaPhe Phe Gly Ile Ser Pro Arg Glu Ala Ala Glu Met Asp 820 825 830 Pro GlnGln Arg Leu Ala Leu Glu Leu Gly Trp Glu Ala Leu Glu Arg 835 840 845 AlaGly Ile Asp Pro Ser Ser Leu Thr Gly Thr Arg Thr Gly Val Phe 850 855 860Ala Gly Ala Ile Trp Asp Asp Tyr Ala Thr Leu Lys His Arg Gln Gly 865 870875 880 Gly Ala Ala Ile Thr Pro His Thr Val Thr Gly Leu His Arg Gly Ile885 890 895 Ile Ala Asn Arg Leu Ser Tyr Thr Leu Gly Leu Arg Gly Pro SerMet 900 905 910 Val Val Asp Ser Gly Gln Ser Ser Ser Leu Val Ala Val HisLeu Ala 915 920 925 Cys Glu Ser Leu Arg Arg Gly Glu Ser Glu Leu Ala LeuAla Gly Gly 930 935 940 Val Ser Leu Asn Leu Val Pro Asp Ser Ile Ile GlyAla Ser Lys Phe 945 950 955 960 Gly Gly Leu Ser Pro Asp Gly Arg Ala TyrThr Phe Asp Ala Arg Ala 965 970 975 Asn Gly Tyr Val Arg Gly Glu Gly GlyGly Phe Val Val Leu Lys Arg 980 985 990 Leu Ser Arg Ala Val Ala Asp GlyAsp Pro Val Leu Ala Val Ile Arg 995 1000 1005 Gly Ser Ala Val Asn AsnGly Gly Ala Ala Gln Gly Met Thr Thr Pro 1010 1015 1020 Asp Ala Gln AlaGln Glu Ala Val Leu Arg Glu Ala His Glu Arg Ala 1025 1030 1035 1040 GlyThr Ala Pro Ala Asp Val Arg Tyr Val Glu Leu His Gly Thr Gly 1045 10501055 Thr Pro Val Gly Asp Pro Ile Glu Ala Ala Ala Leu Gly Ala Ala Leu1060 1065 1070 Gly Thr Gly Arg Pro Ala Gly Gln Pro Leu Leu Val Gly SerVal Lys 1075 1080 1085 Thr Asn Ile Gly His Leu Glu Gly Ala Ala Gly IleAla Gly Leu Ile 1090 1095 1100 Lys Ala Val Leu Ala Val Arg Gly Arg AlaLeu Pro Ala Ser Leu Asn 1105 1110 1115 1120 Tyr Glu Thr Pro Asn Pro AlaIle Pro Phe Glu Glu Leu Asn Leu Arg 1125 1130 1135 Val Asn Thr Glu TyrLeu Pro Trp Glu Pro Glu His Asp Gly Gln Arg 1140 1145 1150 Met Val ValGly Val Ser Ser Phe Gly Met Gly Gly Thr Asn Ala His 1155 1160 1165 ValVal Leu Glu Glu Ala Pro Gly Gly Cys Arg Gly Ala Ser Val Val 1170 11751180 Glu Ser Thr Val Gly Gly Ser 
Ala Val Gly Gly Gly Val Val Pro Trp1185 1190 1195 1200 Val Val Ser Ala Lys Ser Ala Ala Ala Leu Asp Ala GlnIle Glu Arg 1205 1210 1215 Leu Ala Ala Phe Ala Ser Arg Asp Arg Thr AspGly Val Asp Ala Gly 1220 1225 1230 Ala Val Asp Ala Gly Ala Val Asp AlaGly Ala Val Ala Arg Val Leu 1235 1240 1245 Ala Gly Gly Arg Ala Gln PheGlu His Arg Ala Val Val Val Gly Ser 1250 1255 1260 Gly Pro Asp Asp LeuAla Ala Ala Leu Ala Ala Pro Glu Gly Leu Val 1265 1270 1275 1280 Arg GlyVal Ala Ser Gly Val Gly Arg Val Ala Phe Val Phe Pro Gly 1285 1290 1295Gln Gly Thr Gln Trp Ala Gly Met Gly Ala Glu Leu Leu Asp Ser Ser 13001305 1310 Ala Val Phe Ala Ala Ala Met Ala Glu Cys Glu Ala Ala Leu SerPro 1315 1320 1325 Tyr Val Asp Trp Ser Leu Glu Ala Val Val Arg Gln AlaPro Gly Ala 1330 1335 1340 Pro Thr Leu Glu Arg Val Asp Val Val Gln ProVal Thr Phe Ala Val 1345 1350 1355 1360 Met Val Ser Leu Ala Arg Val TrpGln His His Gly Val Thr Pro Gln 1365 1370 1375 Ala Val Val Gly His SerGln Gly Glu Ile Ala Ala Ala Tyr Val Ala 1380 1385 1390 Gly Ala Leu SerLeu Asp Asp Ala Ala Arg Val Val Thr Leu Arg Ser 1395 1400 1405 Lys SerIle Ala Ala His Leu Ala Gly Lys Gly Gly Met Leu Ser Leu 1410 1415 1420Ala Leu Ser Glu Asp Ala Val Leu Glu Arg Leu Ala Gly Phe Asp Gly 14251430 1435 1440 Leu Ser Val Ala Ala Val Asn Gly Pro Thr Ala Thr Val ValSer Gly 1445 1450 1455 Asp Pro Val Gln Ile Glu Glu Leu Ala Arg Ala CysGlu Ala Asp Gly 1460 1465 1470 Val Arg Ala Arg Val Ile Pro Val Asp TyrAla Ser His Ser Arg Gln 1475 1480 1485 Val Glu Ile Ile Glu Ser Glu LeuAla Glu Val Leu Ala Gly Leu Ser 1490 1495 1500 Pro Gln Ala Pro Arg ValPro Phe Phe Ser Thr Leu Glu Gly Ala Trp 1505 1510 1515 1520 Ile Thr GluPro Val Leu Asp Gly Gly Tyr Trp Tyr Arg Asn Leu Arg 1525 1530 1535 HisArg Val Gly Phe Ala Pro Ala Val Glu Thr Leu Ala Thr Asp Glu 1540 15451550 Gly Phe Thr His Phe Val Glu Val Ser Ala His Pro Val Leu Thr Met1555 1560 1565 Ala Leu Pro Gly Thr Val Thr Gly Leu Ala Thr Leu Arg ArgAsp Asn 1570 1575 1580 Gly Gly Gln Asp Arg Leu Val Ala Ser Leu Ala GluAla Trp 
Ala Asn 1585 1590 1595 1600 Gly Leu Ala Val Asp Trp Ser Pro LeuLeu Pro Ser Ala Thr Gly His 1605 1610 1615 His Ser Asp Leu Pro Thr TyrAla Phe Gln Thr Glu Arg His Trp Leu 1620 1625 1630 Gly Glu Ile Glu AlaLeu Ala Pro Ala Gly Glu Pro Ala Val Gln Pro 1635 1640 1645 Ala Val LeuArg Thr Glu Ala Ala Glu Pro Ala Glu Leu Asp Arg Asp 1650 1655 1660 GluGln Leu Arg Val Ile Leu Asp Lys Val Arg Ala Gln Thr Ala Gln 1665 16701675 1680 Val Leu Gly Tyr Ala Thr Gly Gly Gln Ile Glu Val Asp Arg ThrPhe 1685 1690 1695 Arg Glu Ala Gly Cys Thr Ser Leu Thr Gly Val Asp LeuArg Asn Arg 1700 1705 1710 Ile Asn Ala Ala Phe Gly Val Arg Met Ala ProSer Met Ile Phe Asp 1715 1720 1725 Phe Pro Thr Pro Glu Ala Leu Ala GluGln Leu Leu Leu Val Val His 1730 1735 1740 Gly Glu Ala Ala Ala Asn ProAla Gly Ala Glu Pro Ala Pro Val Ala 1745 1750 1755 1760 Ala Ala Gly AlaVal Asp Glu Pro Val Ala Ile Val Gly Met Ala Cys 1765 1770 1775 Arg LeuPro Gly Gly Val Ala Ser Pro Glu Asp Leu Trp Arg Leu Val 1780 1785 1790Ala Gly Gly Gly Asp Ala Ile Ser Glu Phe Pro Gln Asp Arg Gly Trp 17951800 1805 Asp Val Glu Gly Leu Tyr His Pro Asp Pro Glu His Pro Gly ThrSer 1810 1815 1820 Tyr Val Arg Gln Gly Gly Phe Ile Glu Asn Val Ala GlyPhe Asp Ala 1825 1830 1835 1840 Ala Phe Phe Gly Ile Ser Pro Arg Glu AlaLeu Ala Met Asp Pro Gln 1845 1850 1855 Gln Arg Leu Leu Leu Glu Thr SerTrp Glu Ala Val Glu Asp Ala Gly 1860 1865 1870 Ile Asp Pro Thr Ser LeuArg Gly Arg Gln Val Gly Val Phe Thr Gly 1875 1880 1885 Ala Met Thr HisGlu Tyr Gly Pro Ser Leu Arg Asp Gly Gly Glu Gly 1890 1895 1900 Leu AspGly Tyr Leu Leu Thr Gly Asn Thr Ala Ser Val Met Ser Gly 1905 1910 19151920 Arg Val Ser Tyr Thr Leu Gly Leu Glu Gly Pro Ala Leu Thr Val Asp1925 1930 1935 Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu Ala ValGln Ala 1940 1945 1950 Leu Arg Lys Gly Glu Val Asp Met Ala Leu Ala GlyGly Val Ala Val 1955 1960 1965 Met Pro Thr Pro Gly Met Phe Val Glu PheSer Arg Gln Arg Gly Leu 1970 1975 1980 Ala Gly Asp Gly Arg Ser Lys AlaPhe Ala Ala Ser Ala Asp Gly Thr 1985 1990 1995 2000 
Ser Trp Ser Glu GlyVal Gly Val Leu Leu Val Glu Arg Leu Ser Asp 2005 2010 2015 Ala Arg ArgAsn Gly His Gln Val Leu Ala Val Val Arg Gly Ser Ala 2020 2025 2030 LeuAsn Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro 2035 20402045 Ser Gln Gln Arg Val Ile Arg Arg Ala Leu Ala Asp Ala Arg Leu Thr2050 2055 2060 Thr Ser Asp Val Asp Val Val Glu Ala His Gly Thr Gly ThrArg Leu 2065 2070 2075 2080 Gly Asp Pro Ile Glu Ala Gln Ala Leu Ile AlaThr Tyr Gly Gln Gly 2085 2090 2095 Arg Asp Asp Glu Gln Pro Leu Arg LeuGly Ser Leu Lys Ser Asn Ile 2100 2105 2110 Gly His Thr Gln Ala Ala AlaGly Val Ser Gly Val Ile Lys Met Val 2115 2120 2125 Gln Ala Met Arg HisGly Leu Leu Pro Lys Thr Leu His Val Asp Glu 2130 2135 2140 Pro Ser AspGln Ile Asp Trp Ser Ala Gly Ala Val Glu Leu Leu Thr 2145 2150 2155 2160Glu Ala Val Asp Trp Pro Glu Lys Gln Asp Gly Gly Leu Arg Arg Ala 21652170 2175 Ala Val Ser Ser Phe Gly Ile Ser Gly Thr Asn Ala His Val ValLeu 2180 2185 2190 Glu Glu Ala Pro Val Val Val Glu Gly Ala Ser Val ValGlu Pro Ser 2195 2200 2205 Val Gly Gly Ser Ala Val Gly Gly Gly Val ThrPro Trp Val Val Ser 2210 2215 2220 Ala Lys Ser Ala Ala Ala Leu Asp AlaGln Ile Glu Arg Leu Ala Ala 2225 2230 2235 2240 Phe Ala Ser Arg Asp ArgThr Asp Asp Ala Asp Ala Gly Ala Val Asp 2245 2250 2255 Ala Gly Ala ValAla His Val Leu Ala Asp Gly Arg Ala Gln Phe Glu 2260 2265 2270 His ArgAla Val Ala Leu Gly Ala Gly Ala Asp Asp Leu Val Gln Ala 2275 2280 2285Leu Ala Asp Pro Asp Gly Leu Ile Arg Gly Thr Ala Ser Gly Val Gly 22902295 2300 Arg Val Ala Phe Val Phe Pro Gly Gln Gly Thr Gln Trp Ala GlyMet 2305 2310 2315 2320 Gly Ala Glu Leu Leu Asp Ser Ser Ala Val Phe AlaAla Ala Met Ala 2325 2330 2335 Glu Cys Glu Ala Ala Leu Ser Pro Tyr ValAsp Trp Ser Leu Glu Ala 2340 2345 2350 Val Val Arg Gln Ala Pro Gly AlaPro Thr Leu Glu Arg Val Asp Val 2355 2360 2365 Val Gln Pro Val Thr PheAla Val Met Val Ser Leu Ala Arg Val Trp 2370 2375 2380 Gln His His GlyVal Thr Pro Gln Ala Val Val Gly His Ser Gln Gly 2385 2390 2395 2400 GluIle Ala Ala Ala Tyr Val 
Ala Gly Ala Leu Pro Leu Asp Asp Ala 2405 24102415 Ala Arg Val Val Thr Leu Arg Ser Lys Ser Ile Ala Ala His Leu Ala2420 2425 2430 Gly Lys Gly Gly Met Leu Ser Leu Ala Leu Asn Glu Asp AlaVal Leu 2435 2440 2445 Glu Arg Leu Ser Asp Phe Asp Gly Leu Ser Val AlaAla Val Asn Gly 2450 2455 2460 Pro Thr Ala Thr Val Val Ser Gly Asp ProVal Gln Ile Glu Glu Leu 2465 2470 2475 2480 Ala Gln Ala Cys Lys Ala AspGly Phe Arg Ala Arg Ile Ile Pro Val 2485 2490 2495 Asp Tyr Ala Ser HisSer Arg Gln Val Glu Ile Ile Glu Ser Glu Leu 2500 2505 2510 Ala Gln ValLeu Ala Gly Leu Ser Pro Gln Ala Pro Arg Val Pro Phe 2515 2520 2525 PheSer Thr Leu Glu Gly Thr Trp Ile Thr Glu Pro Val Leu Asp Gly 2530 25352540 Thr Tyr Trp Tyr Arg Asn Leu Arg His Arg Val Gly Phe Ala Pro Ala2545 2550 2555 2560 Ile Glu Thr Leu Ala Val Asp Glu Gly Phe Thr His PheVal Glu Val 2565 2570 2575 Ser Ala His Pro Val Leu Thr Met Thr Leu ProGlu Thr Val Thr Gly 2580 2585 2590 Leu Gly Thr Leu Arg Arg Glu Gln GlyGly Gln Glu Arg Leu Val Thr 2595 2600 2605 Ser Leu Ala Glu Ala Trp ValAsn Gly Leu Pro Val Ala Trp Thr Ser 2610 2615 2620 Leu Leu Pro Ala ThrAla Ser Arg Pro Gly Leu Pro Thr Tyr Ala Phe 2625 2630 2635 2640 Gln AlaGlu Arg Tyr Trp Leu Glu Asn Thr Pro Ala Ala Leu Ala Thr 2645 2650 2655Gly Asp Asp Trp Arg Tyr Arg Ile Asp Trp Lys Arg Leu Pro Ala Ala 26602665 2670 Glu Gly Ser Glu Arg Thr Gly Leu Ser Gly Arg Trp Leu Ala ValThr 2675 2680 2685 Pro Glu Asp His Ser Ala Gln Ala Ala Ala Val Leu ThrAla Leu Val 2690 2695 2700 Asp Ala Gly Ala Lys Val Glu Val Leu Thr AlaGly Ala Asp Asp Asp 2705 2710 2715 2720 Arg Glu Ala Leu Ala Ala Arg LeuThr Ala Leu Thr Thr Gly Asp Gly 2725 2730 2735 Phe Thr Gly Val Val SerLeu Leu Asp Gly Leu Val Pro Gln Val Ala 2740 2745 2750 Trp Val Gln AlaLeu Gly Asp Ala Gly Ile Lys Ala Pro Leu Trp Ser 2755 2760 2765 Val ThrGln Gly Ala Val Ser Val Gly Arg Leu Asp Thr Pro Ala Asp 2770 2775 2780Pro Asp Arg Ala Met Leu Trp Gly Leu Gly Arg Val Val Ala Leu Glu 27852790 2795 2800 His Pro Glu Arg Trp Ala Gly Leu Val Asp Leu Pro Ala 
GlnPro Asp 2805 2810 2815 Ala Ala Ala Leu Ala His Leu Val Thr Ala Leu SerGly Ala Thr Gly 2820 2825 2830 Glu Asp Gln Ile Ala Ile Arg Thr Thr GlyLeu His Ala Arg Arg Leu 2835 2840 2845 Ala Arg Ala Pro Leu His Gly ArgArg Pro Thr Arg Asp Trp Gln Pro 2850 2855 2860 His Gly Thr Val Leu IleThr Gly Gly Thr Gly Ala Leu Gly Ser His 2865 2870 2875 2880 Ala Ala ArgTrp Met Ala His His Gly Ala Glu His Leu Leu Leu Val 2885 2890 2895 SerArg Ser Gly Glu Gln Ala Pro Gly Ala Thr Gln Leu Thr Ala Glu 2900 29052910 Leu Thr Ala Ser Gly Ala Arg Val Thr Ile Ala Ala Cys Asp Val Ala2915 2920 2925 Asp Pro His Ala Met Arg Thr Leu Leu Asp Ala Ile Pro AlaGlu Thr 2930 2935 2940 Pro Leu Thr Ala Val Val His Thr Ala Gly Ala LeuAsp Asp Gly Ile 2945 2950 2955 2960 Val Asp Thr Leu Thr Ala Glu Gln ValArg Arg Ala His Arg Ala Lys 2965 2970 2975 Ala Val Gly Ala Ser Val LeuAsp Glu Leu Thr Arg Asp Leu Asp Leu 2980 2985 2990 Asp Ala Phe Val LeuPhe Ser Ser Val Ser Ser Thr Leu Gly Ile Pro 2995 3000 3005 Gly Gln GlyAsn Tyr Ala Pro His Asn Ala Tyr Leu Asp Ala Leu Ala 3010 3015 3020 AlaArg Arg Arg Ala Thr Gly Arg Ser Ala Val Ser Val Ala Trp Gly 3025 30303035 3040 Pro Trp Asp Gly Gly Gly Met Ala Ala Gly Asp Gly Val Ala GluArg 3045 3050 3055 Leu Arg Asn His Gly Val Pro Gly Met Asp Pro Glu LeuAla Leu Ala 3060 3065 3070 Ala Leu Glu Ser Ala Leu Gly Arg Asp Glu ThrAla Ile Thr Val Ala 3075 3080 3085 Asp Ile Asp Trp Asp Arg Phe Tyr LeuAla Tyr Ser Ser Gly Arg Pro 3090 3095 3100 Gln Pro Leu Val Glu Glu LeuPro Glu Val Arg Arg Ile Ile Asp Ala 3105 3110 3115 3120 Arg Asp Ser AlaThr Ser Gly Gln Gly Gly Ser Ser Ala Gln Gly Ala 3125 3130 3135 Asn ProLeu Ala Glu Arg Leu Ala Ala Ala Ala Pro Gly Glu Arg Thr 3140 3145 3150Glu Ile Leu Leu Gly Leu Val Arg Ala Gln Ala Ala Ala Val Leu Arg 31553160 3165 Met Arg Ser Pro Glu Asp Val Ala Ala Asp Arg Ala Phe Lys AspIle 3170 3175 3180 Gly Phe Asp Ser Leu Ala Gly Val Glu Leu Arg Asn ArgLeu Thr Arg 3185 3190 3195 3200 Ala Thr Gly Leu Gln Leu Pro Ala Thr LeuVal Phe Asp His Pro Thr 3205 3210 3215 
Pro Leu Ala Leu Val Ser Leu LeuArg Ser Glu Phe Leu Gly Asp Glu 3220 3225 3230 Glu Thr Ala Asp Ala ArgArg Ser Ala Ala Leu Pro Ala Thr Val Gly 3235 3240 3245 Ala Gly Ala GlyAla Gly Ala Gly Thr Asp Ala Asp Asp Asp Pro Ile 3250 3255 3260 Ala IleVal Ala Met Ser Cys Arg Tyr Pro Gly Asp Ile Arg Ser Pro 3265 3270 32753280 Glu Asp Leu Trp Arg Met Leu Ser Glu Gly Gly Glu Gly Ile Thr Pro3285 3290 3295 Phe Pro Thr Asp Arg Gly Trp Asp Leu Asp Gly Leu Tyr AspAla Asp 3300 3305 3310 Pro Asp Ala Leu Gly Arg Ala Tyr Val Arg Glu GlyGly Phe Leu His 3315 3320 3325 Asp Ala Ala Glu Phe Asp Ala Glu Phe PheGly Val Ser Pro Arg Glu 3330 3335 3340 Ala Leu Ala Met Asp Pro Gln GlnArg Met Leu Leu Thr Thr Ser Trp 3345 3350 3355 3360 Glu Ala Phe Glu ArgAla Gly Ile Glu Pro Ala Ser Leu Arg Gly Ser 3365 3370 3375 Ser Thr GlyVal Phe Ile Gly Leu Ser Tyr Gln Asp Tyr Ala Ala Arg 3380 3385 3390 ValPro Asn Ala Pro Arg Gly Val Glu Gly Tyr Leu Leu Thr Gly Ser 3395 34003405 Thr Pro Ser Val Ala Ser Gly Arg Ile Ala Tyr Thr Phe Gly Leu Glu3410 3415 3420 Gly Pro Ala Thr Thr Val Asp Thr Ala Cys Ser Ser Ser LeuThr Ala 3425 3430 3435 3440 Leu His Leu Ala Val Arg Ala Leu Arg Ser GlyGlu Cys Thr Met Ala 3445 3450 3455 Leu Ala Gly Gly Val Ala Met Met AlaThr Pro His Met Phe Val Glu 3460 3465 3470 Phe Ser Arg Gln Arg Ala LeuAla Pro Asp Gly Arg Ser Lys Ala Phe 3475 3480 3485 Ser Ala Asp Ala AspGly Phe Gly Ala Ala Glu Gly Val Gly Leu Leu 3490 3495 3500 Leu Val GluArg Leu Ser Asp Ala Arg Arg Asn Gly His Pro Val Leu 3505 3510 3515 3520Ala Val Val Arg Gly Thr Ala Val Asn Gln Asp Gly Ala Ser Asn Gly 35253530 3535 Leu Thr Ala Pro Asn Gly Pro Ser Gln Gln Arg Val Ile Arg GlnAla 3540 3545 3550 Leu Ala Asp Ala Arg Leu Ala Pro Gly Asp Ile Asp AlaVal Glu Thr 3555 3560 3565 His Gly Thr Gly Thr Ser Leu Gly Asp Pro IleGlu Ala Gln Gly Leu 3570 3575 3580 Gln Ala Thr Tyr Gly Lys Glu Arg ProAla Glu Arg Pro Leu Ala Ile 3585 3590 3595 3600 Gly Ser Val Lys Ser AsnIle Gly His Thr Gln Ala Ala Ala Gly Ala 3605 3610 3615 Ala Gly Ile IleLys Met Val 
Leu Ala Met Arg His Gly Thr Leu Pro 3620 3625 3630 Lys ThrLeu His Ala Asp Glu Pro Ser Pro His Val Asp Trp Ala Asn 3635 3640 3645Ser Gly Leu Ala Leu Val Thr Glu Pro Ile Asp Trp Pro Ala Gly Thr 36503655 3660 Gly Pro Arg Arg Ala Ala Val Ser Ser Phe Gly Ile Ser Gly ThrAsn 3665 3670 3675 3680 Ala His Val Val Leu Glu Gln Ala Pro Asp Ala AlaGly Glu Val Leu 3685 3690 3695 Gly Ala Asp Glu Val Pro Glu Val Ser GluThr Val Ala Met Ala Gly 3700 3705 3710 Thr Ala Gly Thr Ser Glu Val AlaGlu Gly Ser Glu Ala Ser Glu Ala 3715 3720 3725 Pro Ala Ala Pro Gly SerArg Glu Ala Ser Leu Pro Gly His Leu Pro 3730 3735 3740 Trp Val Leu SerAla Lys Asp Glu Gln Ser Leu Arg Gly Gln Ala Ala 3745 3750 3755 3760 AlaLeu His Ala Trp Leu Ser Glu Pro Ala Ala Asp Leu Ser Asp Ala 3765 37703775 Asp Gly Pro Ala Arg Leu Arg Asp Val Gly Tyr Thr Leu Ala Thr Ser3780 3785 3790 Arg Thr Ala Phe Ala His Arg Ala Ala Val Thr Ala Ala AspArg Asp 3795 3800 3805 Gly Phe Leu Asp Gly Leu Ala Thr Leu Ala Gln GlyGly Thr Ser Ala 3810 3815 3820 His Val His Leu Asp Thr Ala Arg Asp GlyThr Thr Ala Phe Leu Phe 3825 3830 3835 3840 Thr Gly Gln Gly Ser Gln ArgPro Gly Ala Gly Arg Glu Leu Tyr Asp 3845 3850 3855 Arg His Pro Val PheAla Arg Ala Leu Asp Glu Ile Cys Ala His Leu 3860 3865 3870 Asp Gly HisLeu Glu Leu Pro Leu Leu Asp Val Met Phe Ala Ala Glu 3875 3880 3885 GlySer Ala Glu Ala Ala Leu Leu Asp Glu Thr Arg Tyr Thr Gln Cys 3890 38953900 Ala Leu Phe Ala Leu Glu Val Ala Leu Phe Arg Leu Val Glu Ser Trp3905 3910 3915 3920 Gly Met Arg Pro Ala Ala Leu Leu Gly His Ser Val GlyGlu Ile Ala 3925 3930 3935 Ala Ala His Val Ala Gly Val Phe Ser Leu AlaAsp Ala Ala Arg Leu 3940 3945 3950 Val Ala Ala Arg Gly Arg Leu Met GlnGlu Leu Pro Ala Gly Gly Ala 3955 3960 3965 Met Leu Ala Val Gln Ala AlaGlu Asp Glu Ile Arg Val Trp Leu Glu 3970 3975 3980 Thr Glu Glu Arg TyrAla Gly Arg Leu Asp Val Ala Ala Val Asn Gly 3985 3990 3995 4000 Pro GluAla Ala Val Leu Ser Gly Asp Ala Asp Ala Ala Arg Glu Ala 4005 4010 4015Glu Ala Tyr Trp Ser Gly Leu Gly Arg Arg Thr Arg Ala Leu 
Arg Val 40204025 4030 Ser His Ala Phe His Ser Ala His Met Asp Gly Met Leu Asp GlyPhe 4035 4040 4045 Arg Ala Val Leu Glu Thr Val Glu Phe Arg Arg Pro SerLeu Thr Val 4050 4055 4060 Val Ser Asn Val Thr Gly Leu Ala Ala Gly ProAsp Asp Leu Cys Asp 4065 4070 4075 4080 Pro Glu Tyr Trp Val Arg His ValArg Gly Thr Val Arg Phe Leu Asp 4085 4090 4095 Gly Val Arg Val Leu ArgAsp Leu Gly Val Arg Thr Cys Leu Glu Leu 4100 4105 4110 Gly Pro Asp GlyVal Leu Thr Ala Met Ala Ala Asp Gly Leu Ala Asp 4115 4120 4125 Thr ProAla Asp Ser Ala Ala Gly Ser Pro Val Gly Ser Pro Ala Gly 4130 4135 4140Ser Pro Ala Asp Ser Ala Ala Gly Ala Leu Arg Pro Arg Pro Leu Leu 41454150 4155 4160 Val Ala Leu Leu Arg Arg Lys Arg Ser Glu Thr Glu Thr ValAla Asp 4165 4170 4175 Ala Leu Gly Arg Ala His Ala His Gly Thr Gly ProAsp Trp His Ala 4180 4185 4190 Trp Phe Ala Gly Ser Gly Ala His Arg ValAsp Leu Pro Thr Tyr Ser 4195 4200 4205 Phe Arg Arg Asp Arg Tyr Trp LeuAsp Ala Pro Ala Ala Asp Thr Ala 4210 4215 4220 Val Asp Thr Ala Gly LeuGly Leu Gly Thr Ala Asp His Pro Leu Leu 4225 4230 4235 4240 Gly Ala ValVal Ser Leu Pro Asp Arg Asp Gly Leu Leu Leu Thr Gly 4245 4250 4255 ArgLeu Ser Leu Arg Thr His Pro Trp Leu Ala Asp His Ala Val Leu 4260 42654270 Gly Ser Val Leu Leu Pro Gly Ala Ala Met Val Glu Leu Ala Ala His4275 4280 4285 Ala Ala Glu Ser Ala Gly Leu Arg Asp Val Arg Glu Leu ThrLeu Leu 4290 4295 4300 Glu Pro Leu Val Leu Pro Glu His Gly Gly Val GluLeu Arg Val Thr 4305 4310 4315 4320 Val Gly Ala Pro Ala Gly Glu Pro GlyGly Glu Ser Ala Gly Asp Gly 4325 4330 4335 Ala Arg Pro Val Ser Leu HisSer Arg Leu Ala Asp Ala Pro Ala Gly 4340 4345 4350 Thr Ala Trp Ser CysHis Ala Thr Gly Leu Leu Ala Thr Asp Arg Pro 4355 4360 4365 Glu Leu ProVal Ala Pro Asp Arg Ala Ala Met Trp Pro Pro Gln Gly 4370 4375 4380 AlaGlu Glu Val Pro Leu Asp Gly Leu Tyr Glu Arg Leu Asp Gly Asn 4385 43904395 4400 Gly Leu Ala Phe Gly Pro Leu Phe Gln Gly Leu Asn Ala Val TrpArg 4405 4410 4415 Tyr Glu Gly Glu Val Phe Ala Asp Ile Ala Leu Pro AlaThr Thr Asn 4420 4425 4430 Ala 
Thr Ala Pro Ala Thr Ala Asn Gly Gly GlySer Ala Ala Ala Ala 4435 4440 4445 Pro Tyr Gly Ile His Pro Ala Leu LeuAsp Ala Ser Leu His Ala Ile 4450 4455 4460 Ala Val Gly Gly Leu Val AspGlu Pro Glu Leu Val Arg Val Pro Phe 4465 4470 4475 4480 His Trp Ser GlyVal Thr Val His Ala Ala Gly Ala Ala Ala Ala Arg 4485 4490 4495 Val ArgLeu Ala Ser Ala Gly Thr Asp Ala Val Ser Leu Ser Leu Thr 4500 4505 4510Asp Gly Glu Gly Arg Pro Leu Val Ser Val Glu Arg Leu Thr Leu Arg 45154520 4525 Pro Val Thr Ala Asp Gln Ala Ala Ala Ser Arg Val Gly Gly LeuMet 4530 4535 4540 His Arg Val Ala Trp Arg Pro Tyr Ala Leu Ala Ser SerGly Glu Gln 4545 4550 4555 4560 Asp Pro His Ala Thr Ser Tyr Gly Pro ThrAla Val Leu Gly Lys Asp 4565 4570 4575 Glu Leu Lys Val Ala Ala Ala LeuGlu Ser Ala Gly Val Glu Val Gly 4580 4585 4590 Leu Tyr Pro Asp Leu AlaAla Leu Ser Gln Asp Val Ala Ala Gly Ala 4595 4600 4605 Pro Ala Pro ArgThr Val Leu Ala Pro Leu Pro Ala Gly Pro Ala Asp 4610 4615 4620 Gly GlyAla Glu Gly Val Arg Gly Thr Val Ala Arg Thr Leu Glu Leu 4625 4630 46354640 Leu Gln Ala Trp Leu Ala Asp Glu His Leu Ala Gly Thr Arg Leu Leu4645 4650 4655 Leu Val Thr Arg Gly Ala Val Arg Asp Pro Glu Gly Ser GlyAla Asp 4660 4665 4670 Asp Gly Gly Glu Asp Leu Ser His Ala Ala Ala TrpGly Leu Val Arg 4675 4680 4685 Thr Ala Gln Thr Glu Asn Pro Gly Arg PheGly Leu Leu Asp Leu Ala 4690 4695 4700 Asp Asp Ala Ser Ser Tyr Arg ThrLeu Pro Ser Val Leu Ser Asp Ala 4705 4710 4715 4720 Gly Leu Arg Asp GluPro Gln Leu Ala Leu His Asp Gly Thr Ile Arg 4725 4730 4735 Leu Ala ArgLeu Ala Ser Val Arg Pro Glu Thr Gly Thr Ala Ala Pro 4740 4745 4750 AlaLeu Ala Pro Glu Gly Thr Val Leu Leu Thr Gly Gly Thr Gly Gly 4755 47604765 Leu Gly Gly Leu Val Ala Arg His Val Val Gly Glu Trp Gly Val Arg4770 4775 4780 Arg Leu Leu Leu Val Ser Arg Arg Gly Thr Asp Ala Pro GlyAla Asp 4785 4790 4795 4800 Glu Leu Val His Glu Leu Glu Ala Leu Gly AlaAsp Val Ser Val Ala 4805 4810 4815 Ala Cys Asp Val Ala Asp Arg Glu AlaLeu Thr Ala Val Leu Asp Ala 4820 4825 4830 Ile Pro Ala Glu His Pro LeuThr 
Ala Val Val His Thr Ala Gly Val 4835 4840 4845 Leu Ser Asp Gly ThrLeu Pro Ser Met Thr Thr Glu Asp Val Glu His 4850 4855 4860 Val Leu ArgPro Lys Val Asp Ala Ala Phe Leu Leu Asp Glu Leu Thr 4865 4870 4875 4880Ser Thr Pro Ala Tyr Asp Leu Ala Ala Phe Val Met Phe Ser Ser Ala 48854890 4895 Ala Ala Val Phe Gly Gly Ala Gly Gln Gly Ala Tyr Ala Ala AlaAsn 4900 4905 4910 Ala Thr Leu Asp Ala Leu Ala Trp Arg Arg Arg Ala AlaGly Leu Pro 4915 4920 4925 Ala Leu Ser Leu Gly Trp Gly Leu Trp Ala GluThr Ser Gly Met Thr 4930 4935 4940 Gly Glu Leu Gly Gln Ala Asp Leu ArgArg Met Ser Arg Ala Gly Ile 4945 4950 4955 4960 Gly Gly Ile Ser Asp AlaGlu Gly Ile Ala Leu Leu Asp Ala Ala Leu 4965 4970 4975 Arg Asp Asp ArgHis Pro Val Leu Leu Pro Leu Arg Leu Asp Ala Ala 4980 4985 4990 Gly LeuArg Asp Ala Ala Gly Asn Asp Pro Ala Gly Ile Pro Ala Leu 4995 5000 5005Phe Arg Asp Val Val Gly Ala Arg Thr Val Arg Ala Arg Pro Ser Ala 50105015 5020 Ala Ser Ala Ser Thr Thr Ala Gly Thr Ala Gly Thr Pro Gly ThrAla 5025 5030 5035 5040 Asp Gly Ala Ala Glu Thr Ala Ala Val Thr Leu AlaAsp Arg Ala Ala 5045 5050 5055 Thr Val Asp Gly Pro Ala Arg Gln Arg LeuLeu Leu Glu Phe Val Val 5060 5065 5070 Gly Glu Val Ala Glu Val Leu GlyHis Ala Arg Gly His Arg Ile Asp 5075 5080 5085 Ala Glu Arg Gly Phe LeuAsp Leu Gly Phe Asp Ser Leu Thr Ala Val 5090 5095 5100 Glu Leu Arg AsnArg Leu Asn Ser Ala Gly Gly Leu Ala Leu Pro Ala 5105 5110 5115 5120 ThrLeu Val Phe Asp His Pro Ser Pro Ala Ala Leu Ala Ser His Leu 5125 51305135 Asp Ala Glu Leu Pro Arg Gly Ala Ser Asp Gln Asp Gly Ala Gly Asn5140 5145 5150 Arg Asn Gly Asn Glu Asn Gly Thr Thr Ala Ser Arg Ser ThrAla Glu 5155 5160 5165 Thr Asp Ala Leu Leu Ala Gln Leu Thr Arg Leu GluGly Ala Leu Val 5170 5175 5180 Leu Thr Gly Leu Ser Asp Ala Pro Gly SerGlu Glu Val Leu Glu His 5185 5190 5195 5200 Leu Arg Ser Leu Arg Ser MetVal Thr Gly Glu Thr Gly Thr Gly Thr 5205 5210 5215 Ala Ser Gly Ala ProAsp Gly Ala Gly Ser Gly Ala Glu Asp Arg Pro 5220 5225 5230 Trp Ala AlaGly Asp Gly Ala Gly Gly Gly Ser Glu Asp Gly Ala 
Gly 5235 5240 5245 ValPro Asp Phe Met Asn Ala Ser Ala Glu Glu Leu Phe Gly Leu Leu 5250 52555260 Asp Gln Asp Pro Ser Thr Asp Met Ser Thr Val Asn Glu Glu Lys Tyr5265 5270 5275 5280 Leu Asp Tyr Leu Arg Arg Ala Thr Ala Asp Leu His GluAla Arg Gly 5285 5290 5295 Arg Leu Arg Glu Leu Glu Ala Lys Ala Gly GluPro Val Ala Ile Val 5300 5305 5310 Gly Met Ala Cys Arg Leu Pro Gly GlyVal Ala Ser Pro Glu Asp Leu 5315 5320 5325 Trp Arg Leu Val Ala Gly GlyGlu Asp Ala Ile Ser Glu Phe Pro Gln 5330 5335 5340 Asp Arg Gly Trp AspVal Glu Gly Leu Tyr Asp Pro Asn Pro Glu Ala 5345 5350 5355 5360 Thr GlyLys Ser Tyr Ala Arg Glu Ala Gly Phe Leu Tyr Glu Ala Gly 5365 5370 5375Glu Phe Asp Ala Asp Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu Ala 53805385 5390 Met Asp Pro Gln Gln Arg Leu Leu Leu Glu Ala Ser Trp Glu AlaPhe 5395 5400 5405 Glu His Ala Gly Ile Pro Ala Ala Thr Ala Arg Gly ThrSer Val Gly 5410 5415 5420 Val Phe Thr Gly Val Met Tyr His Asp Tyr AlaThr Arg Leu Thr Asp 5425 5430 5435 5440 Val Pro Glu Gly Ile Glu Gly TyrLeu Gly Thr Gly Asn Ser Gly Ser 5445 5450 5455 Val Ala Ser Gly Arg ValAla Tyr Thr Leu Gly Leu Glu Gly Pro Ala 5460 5465 5470 Val Thr Val AspThr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu 5475 5480 5485 Ala ValGln Ala Leu Arg Lys Gly Glu Val Asp Met Ala Leu Ala Gly 5490 5495 5500Gly Val Thr Val Met Ser Thr Pro Ser Thr Phe Val Glu Phe Ser Arg 55055510 5515 5520 Gln Arg Gly Leu Ala Pro Asp Gly Arg Ser Lys Ser Phe SerSer Thr 5525 5530 5535 Ala Asp Gly Thr Ser Trp Ser Glu Gly Val Gly ValLeu Leu Val Glu 5540 5545 5550 Arg Leu Ser Asp Ala Arg Arg Lys Gly HisArg Ile Leu Ala Val Val 5555 5560 5565 Arg Gly Thr Ala Val Asn Gln AspGly Ala Ser Ser Gly Leu Thr Ala 5570 5575 5580 Pro Asn Gly Pro Ser GlnGln Arg Val Ile Arg Arg Ala Leu Ala Asp 5585 5590 5595 5600 Ala Arg LeuThr Thr Ser Asp Val Asp Val Val Glu Ala His Gly Thr 5605 5610 5615 GlyThr Arg Leu Gly Asp Pro Ile Glu Ala Gln Ala Val Ile Ala Thr 5620 56255630 Tyr Gly Gln Gly Arg Asp Gly Glu Gln Pro Leu Arg Leu Gly Ser Leu5635 5640 5645 Lys Ser 
Asn Ile Gly His Thr Gln Ala Ala Ala Gly Val SerGly Val 5650 5655 5660 Ile Lys Met Val Gln Ala Met Arg His Gly Val LeuPro Lys Thr Leu 5665 5670 5675 5680 His Val Glu Lys Pro Thr Asp Gln ValAsp Trp Ser Ala Gly Ala Val 5685 5690 5695 Glu Leu Leu Thr Glu Ala MetAsp Trp Pro Asp Lys Gly Asp Gly Gly 5700 5705 5710 Leu Arg Arg Ala AlaVal Ser Ser Phe Gly Val Ser Gly Thr Asn Ala 5715 5720 5725 His Val ValLeu Glu Glu Ala Pro Ala Ala Glu Glu Thr Pro Ala Ser 5730 5735 5740 GluAla Thr Pro Ala Val Glu Pro Ser Val Gly Ala Gly Leu Val Pro 5745 57505755 5760 Trp Leu Val Ser Ala Lys Thr Pro Ala Ala Leu Asp Ala Gln IleGly 5765 5770 5775 Arg Leu Ala Ala Phe Ala Ser Gln Gly Arg Thr Asp AlaAla Asp Pro 5780 5785 5790 Gly Ala Val Ala Arg Val Leu Ala Gly Gly ArgAla Glu Phe Glu His 5795 5800 5805 Arg Ala Val Val Leu Gly Thr Gly GlnAsp Asp Phe Ala Gln Ala Leu 5810 5815 5820 Thr Ala Pro Glu Gly Leu IleArg Gly Thr Pro Ser Asp Val Gly Arg 5825 5830 5835 5840 Val Ala Phe ValPhe Pro Gly Gln Gly Thr Gln Trp Ala Gly Met Gly 5845 5850 5855 Ala GluLeu Leu Asp Val Ser Lys Glu Phe Ala Ala Ala Met Ala Glu 5860 5865 5870Cys Glu Ser Ala Leu Ser Arg Tyr Val Asp Trp Ser Leu Glu Ala Val 58755880 5885 Val Arg Gln Ala Pro Gly Ala Pro Thr Leu Glu Arg Val Asp ValVal 5890 5895 5900 Gln Pro Val Thr Phe Ala Val Met Val Ser Leu Ala LysVal Trp Gln 5905 5910 5915 5920 His His Gly Val Thr Pro Gln Ala Val ValGly His Ser Gln Gly Glu 5925 5930 5935 Ile Ala Ala Ala Tyr Val Ala GlyAla Leu Thr Leu Asp Asp Ala Ala 5940 5945 5950 Arg Val Val Thr Leu ArgSer Lys Ser Ile Ala Ala His Leu Ala Gly 5955 5960 5965 Lys Gly Gly MetIle Ser Leu Ala Leu Ser Glu Glu Ala Thr Arg Gln 5970 5975 5980 Arg IleGlu Asn Leu His Gly Leu Ser Ile Ala Ala Val Asn Gly Pro 5985 5990 59956000 Thr Ala Thr Val Val Ser Gly Asp Pro Thr Gln Ile Gln Glu Leu Ala6005 6010 6015 Gln Ala Cys Glu Ala Asp Gly Val Arg Ala Arg Ile Ile ProVal Asp 6020 6025 6030 Tyr Ala Ser His Ser Ala His Val Glu Thr Ile GluSer Glu Leu Ala 6035 6040 6045 Glu Val Leu Ala Gly Leu Ser Pro Arg 
ThrPro Glu Val Pro Phe Phe 6050 6055 6060 Ser Thr Leu Glu Gly Ala Trp IleThr Glu Pro Val Leu Asp Gly Thr 6065 6070 6075 6080 Tyr Trp Tyr Arg AsnLeu Arg His Arg Val Gly Phe Ala Pro Ala Val 6085 6090 6095 Glu Thr LeuAla Thr Asp Glu Gly Phe Thr His Phe Ile Glu Val Ser 6100 6105 6110 AlaHis Pro Val Leu Thr Met Thr Leu Pro Glu Thr Val Thr Gly Leu 6115 61206125 Gly Thr Leu Arg Arg Glu Gln Gly Gly Gln Glu Arg Leu Val Thr Ser6130 6135 6140 Leu Ala Glu Ala Trp Thr Asn Gly Leu Thr Ile Asp Trp AlaPro Val 6145 6150 6155 6160 Leu Pro Thr Ala Thr Gly His His Pro Glu LeuPro Thr Tyr Ala Phe 6165 6170 6175 Gln Arg Arg His Tyr Trp Leu His AspSer Pro Ala Val Gln Gly Ser 6180 6185 6190 Val Gln Asp Ser Trp Arg TyrArg Ile Asp Trp Lys Arg Leu Ala Val 6195 6200 6205 Ala Asp Ala Ser GluArg Ala Gly Leu Ser Gly Arg Trp Leu Val Val 6210 6215 6220 Val Pro GluAsp Arg Ser Ala Glu Ala Ala Pro Val Leu Ala Ala Leu 6225 6230 6235 6240Ser Gly Ala Gly Ala Asp Pro Val Gln Leu Asp Val Ser Pro Leu Gly 62456250 6255 Asp Arg Gln Arg Leu Ala Ala Thr Leu Gly Glu Ala Leu Ala AlaAla 6260 6265 6270 Gly Gly Ala Val Asp Gly Val Leu Ser Leu Leu Ala TrpAsp Glu Ser 6275 6280 6285 Ala His Pro Gly His Pro Ala Pro Phe Thr ArgGly Thr Gly Ala Thr 6290 6295 6300 Leu Thr Leu Val Gln Ala Leu Glu AspAla Gly Val Ala Ala Pro Leu 6305 6310 6315 6320 Trp Cys Val Thr His GlyAla Val Ser Val Gly Arg Ala Asp His Val 6325 6330 6335 Thr Ser Pro AlaGln Ala Met Val Trp Gly Met Gly Arg Val Ala Ala 6340 6345 6350 Leu GluHis Pro Glu Arg Trp Gly Gly Leu Ile Asp Leu Pro Ser Asp 6355 6360 6365Ala Asp Arg Ala Ala Leu Asp Arg Met Thr Thr Val Leu Ala Gly Gly 63706375 6380 Thr Gly Glu Asp Gln Val Ala Val Arg Ala Ser Gly Leu Leu AlaArg 6385 6390 6395 6400 Arg Leu Val Arg Ala Ser Leu Pro Ala His Gly ThrAla Ser Pro Trp 6405 6410 6415 Trp Gln Ala Asp Gly Thr Val Leu Val ThrGly Ala Glu Glu Pro Ala 6420 6425 6430 Ala Ala Glu Ala Ala Arg Arg LeuAla Arg Asp Gly Ala Gly His Leu 6435 6440 6445 Leu Leu His Thr Thr ProSer Gly Ser Glu Gly Ala Glu Gly Thr Ser 
6450 6455 6460 Gly Ala Ala GluAsp Ser Gly Leu Ala Gly Leu Val Ala Glu Leu Ala 6465 6470 6475 6480 AspLeu Gly Ala Thr Ala Thr Val Val Thr Cys Asp Leu Thr Asp Ala 6485 64906495 Glu Ala Ala Ala Arg Leu Leu Ala Gly Val Ser Asp Ala His Pro Leu6500 6505 6510 Ser Ala Val Leu His Leu Pro Pro Thr Val Asp Ser Glu ProLeu Ala 6515 6520 6525 Ala Thr Asp Ala Asp Ala Leu Ala Arg Val Val ThrAla Lys Ala Thr 6530 6535 6540 Ala Ala Leu His Leu Asp Arg Leu Leu ArgGlu Ala Ala Ala Ala Gly 6545 6550 6555 6560 Gly Arg Pro Pro Val Leu ValLeu Phe Ser Ser Val Ala Ala Ile Trp 6565 6570 6575 Gly Gly Ala Gly GlnGly Ala Tyr Ala Ala Gly Thr Ala Phe Leu Asp 6580 6585 6590 Ala Leu AlaGly Gln His Arg Ala Asp Gly Pro Thr Val Thr Ser Val 6595 6600 6605 AlaTrp Ser Pro Trp Glu Gly Ser Arg Val Thr Glu Gly Ala Thr Gly 6610 66156620 Glu Arg Leu Arg Arg Leu Gly Leu Arg Pro Leu Ala Pro Ala Thr Ala6625 6630 6635 6640 Leu Thr Ala Leu Asp Thr Ala Leu Gly His Gly Asp ThrAla Val Thr 6645 6650 6655 Ile Ala Asp Val Asp Trp Ser Ser Phe Ala ProGly Phe Thr Thr Ala 6660 6665 6670 Arg Pro Gly Thr Leu Leu Ala Asp LeuPro Glu Ala Arg Arg Ala Leu 6675 6680 6685 Asp Glu Gln Gln Ser Thr ThrAla Ala Asp Asp Thr Val Leu Ser Arg 6690 6695 6700 Glu Leu Gly Ala LeuThr Gly Ala Glu Gln Gln Arg Arg Met Gln Glu 6705 6710 6715 6720 Leu ValArg Glu His Leu Ala Val Val Leu Asn His Pro Ser Pro Glu 6725 6730 6735Ala Val Asp Thr Gly Arg Ala Phe Arg Asp Leu Gly Phe Asp Ser Leu 67406745 6750 Thr Ala Val Glu Leu Arg Asn Arg Leu Lys Asn Ala Thr Gly LeuAla 6755 6760 6765 Leu Pro Ala Thr Leu Val Phe Asp Tyr Pro Thr Pro ArgThr Leu Ala 6770 6775 6780 Glu Phe Leu Leu Ala Glu Ile Leu Gly Glu GlnAla Gly Ala Gly Glu 6785 6790 6795 6800 Gln Leu Pro Val Asp Gly Gly ValAsp Asp Glu Pro Val Ala Ile Val 6805 6810 6815 Gly Met Ala Cys Arg LeuPro Gly Gly Val Ala Ser Pro Glu Asp Leu 6820 6825 6830 Trp Arg Leu ValAla Gly Gly Glu Asp Ala Ile Ser Gly Phe Pro Gln 6835 6840 6845 Asp ArgGly Trp Asp Val Glu Gly Leu Tyr Asp Pro Asp Pro Asp Ala 6850 6855 6860Ser Gly Arg 
Thr Tyr Cys Arg Ala Gly Gly Phe Leu Asp Glu Ala Gly 68656870 6875 6880 Glu Phe Asp Ala Asp Phe Phe Gly Ile Ser Pro Arg Glu AlaLeu Ala 6885 6890 6895 Met Asp Pro Gln Gln Arg Leu Leu Leu Glu Thr SerTrp Glu Ala Val 6900 6905 6910 Glu Asp Ala Gly Ile Asp Pro Thr Ser LeuGln Gly Gln Gln Val Gly 6915 6920 6925 Val Phe Ala Gly Thr Asn Gly ProHis Tyr Glu Pro Leu Leu Arg Asn 6930 6935 6940 Thr Ala Glu Asp Leu GluGly Tyr Val Gly Thr Gly Asn Ala Ala Ser 6945 6950 6955 6960 Ile Met SerGly Arg Val Ser Tyr Thr Leu Gly Leu Glu Gly Pro Ala 6965 6970 6975 ValThr Val Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu 6980 69856990 Ala Val Gln Ala Leu Arg Lys Gly Glu Cys Gly Leu Ala Leu Ala Gly6995 7000 7005 Gly Val Thr Val Met Ser Thr Pro Thr Thr Phe Val Glu PheSer Arg 7010 7015 7020 Gln Arg Gly Leu Ala Glu Asp Gly Arg Ser Lys AlaPhe Ala Ala Ser 7025 7030 7035 7040 Ala Asp Gly Phe Gly Pro Ala Glu GlyVal Gly Met Leu Leu Val Glu 7045 7050 7055 Arg Leu Ser Asp Ala Arg ArgAsn Gly His Arg Val Leu Ala Val Val 7060 7065 7070 Arg Gly Ser Ala ValAsn Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala 7075 7080 7085 Pro Asn GlyPro Ser Gln Gln Arg Val Ile Arg Arg Ala Leu Ala Asp 7090 7095 7100 AlaArg Leu Thr Thr Ala Asp Val Asp Val Val Glu Ala His Gly Thr 7105 71107115 7120 Gly Thr Arg Leu Gly Asp Pro Ile Glu Ala Gln Ala Leu Ile AlaThr 7125 7130 7135 Tyr Gly Gln Gly Arg Asp Thr Glu Gln Pro Leu Arg LeuGly Ser Leu 7140 7145 7150 Lys Ser Asn Ile Gly His Thr Gln Ala Ala AlaGly Val Ser Gly Ile 7155 7160 7165 Ile Lys Met Val Gln Ala Met Arg HisGly Val Leu Pro Lys Thr Leu 7170 7175 7180 His Val Asp Arg Pro Ser AspGln Ile Asp Trp Ser Ala Gly Thr Val 7185 7190 7195 7200 Glu Leu Leu ThrGlu Ala Met Asp Trp Pro Arg Lys Gln Glu Gly Gly 7205 7210 7215 Leu ArgArg Ala Ala Val Ser Ser Phe Gly Ile Ser Gly Thr Asn Ala 7220 7225 7230His Ile Val Leu Glu Glu Ala Pro Val Asp Glu Asp Ala Pro Ala Asp 72357240 7245 Glu Pro Ser Val Gly Gly Val Val Pro Trp Leu Val Ser Ala LysThr 7250 7255 7260 Pro Ala Ala Leu Asp Ala Gln Ile Gly Arg 
Leu Ala AlaPhe Ala Ser 7265 7270 7275 7280 Gln Gly Arg Thr Asp Ala Ala Asp Pro GlyAla Val Ala Arg Val Leu 7285 7290 7295 Ala Gly Gly Arg Ala Gln Phe GluHis Arg Ala Val Ala Leu Gly Thr 7300 7305 7310 Gly Gln Asp Asp Leu AlaAla Ala Leu Ala Ala Pro Glu Gly Leu Val 7315 7320 7325 Arg Gly Val AlaSer Gly Val Gly Arg Val Ala Phe Val Phe Pro Gly 7330 7335 7340 Gln GlyThr Gln Trp Ala Gly Met Gly Ala Glu Leu Leu Asp Val Ser 7345 7350 73557360 Lys Glu Phe Ala Ala Ala Met Ala Glu Cys Glu Ala Ala Leu Ala Pro7365 7370 7375 Tyr Val Asp Trp Ser Leu Glu Ala Val Val Arg Gln Ala ProGly Ala 7380 7385 7390 Pro Thr Leu Glu Arg Val Asp Val Val Gln Pro ValThr Phe Ala Val 7395 7400 7405 Met Val Ser Leu Ala Lys Val Trp Gln HisHis Gly Val Thr Pro Gln 7410 7415 7420 Ala Val Val Gly His Ser Gln GlyGlu Ile Ala Ala Ala Tyr Val Ala 7425 7430 7435 7440 Gly Ala Leu Ser LeuAsp Asp Ala Ala Arg Val Val Thr Leu Arg Ser 7445 7450 7455 Lys Ser IleGly Ala His Leu Ala Gly Gln Gly Gly Met Leu Ser Leu 7460 7465 7470 AlaLeu Ser Glu Ala Ala Val Val Glu Arg Leu Ala Gly Phe Asp Gly 7475 74807485 Leu Ser Val Ala Ala Val Asn Gly Pro Thr Ala Thr Val Val Ser Gly7490 7495 7500 Asp Pro Thr Gln Ile Gln Glu Leu Ala Gln Ala Cys Glu AlaAsp Gly 7505 7510 7515 7520 Val Arg Ala Arg Ile Ile Pro Val Asp Tyr AlaSer His Ser Ala His 7525 7530 7535 Val Glu Thr Ile Glu Ser Glu Leu AlaAsp Val Leu Ala Gly Leu Ser 7540 7545 7550 Pro Gln Thr Pro Gln Val ProPhe Phe Ser Thr Leu Glu Gly Ala Trp 7555 7560 7565 Ile Thr Glu Pro AlaLeu Asp Gly Gly Tyr Trp Tyr Arg Asn Leu Arg 7570 7575 7580 His Arg ValGly Phe Ala Pro Ala Val Glu Thr Leu Ala Thr Asp Glu 7585 7590 7595 7600Gly Phe Thr His Phe Val Glu Val Ser Ala His Pro Val Leu Thr Met 76057610 7615 Ala Leu Pro Glu Thr Val Thr Gly Leu Gly Thr Leu Arg Arg AspAsn 7620 7625 7630 Gly Gly Gln His Arg Leu Thr Thr Ser Leu Ala Glu AlaTrp Ala Asn 7635 7640 7645 Gly Leu Thr Val Asp Trp Ala Ser Leu Leu ProThr Thr Thr Thr His 7650 7655 7660 Pro Asp Leu Pro Thr Tyr Ala Phe GlnThr Glu Arg Tyr Trp Pro Gln 7665 
7670 7675 7680 Pro Asp Leu Ser Ala AlaGly Asp Ile Thr Ser Ala Gly Leu Gly Ala 7685 7690 7695 Ala Glu His ProLeu Leu Gly Ala Ala Val Ala Leu Ala Asp Ser Asp 7700 7705 7710 Gly CysLeu Leu Thr Gly Ser Leu Ser Leu Arg Thr His Pro Trp Leu 7715 7720 7725Ala Asp His Ala Val Ala Gly Thr Val Leu Leu Pro Gly Thr Ala Phe 77307735 7740 Val Glu Leu Ala Phe Arg Ala Gly Asp Gln Val Gly Cys Asp LeuVal 7745 7750 7755 7760 Glu Glu Leu Thr Leu Asp Ala Pro Leu Val Leu ProArg Arg Gly Ala 7765 7770 7775 Val Arg Val Gln Leu Ser Val Gly Ala SerAsp Glu Ser Gly Arg Arg 7780 7785 7790 Thr Phe Gly Leu Tyr Ala His ProGlu Asp Ala Pro Gly Glu Ala Glu 7795 7800 7805 Trp Thr Arg His Ala ThrGly Val Leu Ala Ala Arg Ala Asp Arg Thr 7810 7815 7820 Ala Pro Val AlaAsp Pro Glu Ala Trp Pro Pro Pro Gly Ala Glu Pro 7825 7830 7835 7840 ValAsp Val Asp Gly Leu Tyr Glu Arg Phe Ala Ala Asn Gly Tyr Gly 7845 78507855 Tyr Gly Pro Leu Phe Gln Gly Val Arg Gly Val Trp Arg Arg Gly Asp7860 7865 7870 Glu Val Phe Ala Asp Val Ala Leu Pro Ala Glu Val Ala GlyAla Glu 7875 7880 7885 Gly Ala Arg Phe Gly Leu His Pro Ala Leu Leu AspAla Ala Val Gln 7890 7895 7900 Ala Ala Gly Ala Gly Arg Gly Val Arg ArgGly His Ala Ala Ala Val 7905 7910 7915 7920 Arg Leu Glu Arg Asp Leu LeuTyr Ala Val Gly Ala Thr Ala Leu Arg 7925 7930 7935 Val Arg Leu Ala ProAla Gly Pro Asp Thr Val Ser Val Ser Ala Ala 7940 7945 7950 Asp Ser SerGly Gln Pro Val Phe Ala Ala Asp Ser Leu Thr Val Leu 7955 7960 7965 ProVal Asp Pro Ala Gln Leu Ala Ala Phe Ser Asp Pro Thr Leu Asp 7970 79757980 Ala Leu His Leu Leu Glu Trp Thr Ala Trp Asp Gly Ala Ala Gln Ala7985 7990 7995 8000 Leu Pro Gly Ala Val Val Leu Gly Gly Asp Ala Asp GlyLeu Ala Ala 8005 8010 8015 Ala Leu Arg Ala Gly Gly Thr Glu Val Leu SerPhe Pro Asp Leu Thr 8020 8025 8030 Asp Leu Val Glu Ala Val Asp Arg GlyGlu Thr Pro Ala Pro Ala Thr 8035 8040 8045 Val Leu Val Ala Cys Pro AlaAla Gly Pro Asp Gly Pro Glu His Val 8050 8055 8060 Arg Glu Ala Leu HisGly Ser Leu Ala Leu Met Gln Ala Trp Leu Ala 8065 8070 8075 8080 Asp GluArg 
Phe Thr Asp Gly Arg Leu Val Leu Val Thr Arg Asp Ala 8085 8090 8095Val Ala Ala Arg Ser Gly Asp Gly Leu Arg Ser Thr Gly Gln Ala Ala 81008105 8110 Val Trp Gly Leu Gly Arg Ser Ala Gln Thr Glu Ser Pro Gly ArgPhe 8115 8120 8125 Val Leu Leu Asp Leu Ala Gly Glu Ala Arg Thr Ala GlyAsp Ala Thr 8130 8135 8140 Ala Gly Asp Gly Leu Thr Thr Gly Asp Ala ThrVal Gly Gly Thr Ser 8145 8150 8155 8160 Gly Asp Ala Ala Leu Gly Ser AlaLeu Ala Thr Ala Leu Gly Ser Gly 8165 8170 8175 Glu Pro Gln Leu Ala LeuArg Asp Gly Ala Leu Leu Val Pro Arg Leu 8180 8185 8190 Ala Arg Ala AlaAla Pro Ala Ala Ala Asp Gly Leu Ala Ala Ala Asp 8195 8200 8205 Gly LeuAla Ala Leu Pro Leu Pro Ala Ala Pro Ala Leu Trp Arg Leu 8210 8215 8220Glu Pro Gly Thr Asp Gly Ser Leu Glu Ser Leu Thr Ala Ala Pro Gly 82258230 8235 8240 Asp Ala Glu Thr Leu Ala Pro Glu Pro Leu Gly Pro Gly GlnVal Arg 8245 8250 8255 Ile Ala Ile Arg Ala Thr Gly Leu Asn Phe Arg AspVal Leu Ile Ala 8260 8265 8270 Leu Gly Met Tyr Pro Asp Pro Ala Leu MetGly Thr Glu Gly Ala Gly 8275 8280 8285 Val Val Thr Ala Thr Gly Pro GlyVal Thr His Leu Ala Pro Gly Asp 8290 8295 8300 Arg Val Met Gly Leu LeuSer Gly Ala Tyr Ala Pro Val Val Val Ala 8305 8310 8315 8320 Asp Ala ArgThr Val Ala Arg Met Pro Glu Gly Trp Thr Phe Ala Gln 8325 8330 8335 GlyAla Ser Val Pro Val Val Phe Leu Thr Ala Val Tyr Ala Leu Arg 8340 83458350 Asp Leu Ala Asp Val Lys Pro Gly Glu Arg Leu Leu Val His Ser Ala8355 8360 8365 Ala Gly Gly Val Gly Met Ala Ala Val Gln Leu Ala Arg HisTrp Gly 8370 8375 8380 Val Glu Val His Gly Thr Ala Ser His Gly Lys TrpAsp Ala Leu Arg 8385 8390 8395 8400 Ala Leu Gly Leu Asp Asp Ala His IleAla Ser Ser Arg Thr Leu Asp 8405 8410 8415 Phe Glu Ser Ala Phe Arg AlaAla Ser Gly Gly Ala Gly Met Asp Val 8420 8425 8430 Val Leu Asn Ser LeuAla Arg Glu Phe Val Asp Ala Ser Leu Arg Leu 8435 8440 8445 Leu Gly ProGly Gly Arg Phe Val Glu Met Gly Lys Thr Asp Val Arg 8450 8455 8460 AspAla Glu Arg Val Ala Ala Asp His Pro Gly Val Gly Tyr Arg Ala 8465 84708475 8480 Phe Asp Leu Gly Glu Ala Gly Pro Glu Arg 
Ile Gly Glu Met LeuAla 8485 8490 8495 Glu Val Ile Ala Leu Phe Glu Asp Gly Val Leu Arg HisLeu Pro Val 8500 8505 8510 Thr Thr Trp Asp Val Arg Arg Ala Arg Asp AlaPhe Arg His Val Ser 8515 8520 8525 Gln Ala Arg His Thr Gly Lys Val ValLeu Thr Met Pro Ser Gly Leu 8530 8535 8540 Asp Pro Glu Gly Thr Val LeuLeu Thr Gly Gly Thr Gly Ala Leu Gly 8545 8550 8555 8560 Gly Ile Val AlaArg His Val Val Gly Glu Trp Gly Val Arg Arg Leu 8565 8570 8575 Leu LeuVal Ser Arg Arg Gly Thr Asp Ala Pro Gly Ala Gly Glu Leu 8580 8585 8590Val His Glu Leu Glu Ala Leu Gly Ala Asp Val Ser Val Ala Ala Cys 85958600 8605 Asp Val Ala Asp Arg Glu Ala Leu Thr Ala Val Leu Asp Ser IlePro 8610 8615 8620 Ala Glu His Pro Leu Thr Ala Val Val His Thr Ala GlyVal Leu Ser 8625 8630 8635 8640 Asp Gly Thr Leu Pro Ser Met Thr Ala GluAsp Val Glu His Val Leu 8645 8650 8655 Arg Pro Lys Val Asp Ala Ala PheLeu Leu Asp Glu Leu Thr Ser Thr 8660 8665 8670 Pro Gly Tyr Asp Leu AlaAla Phe Val Met Phe Ser Ser Ala Ala Ala 8675 8680 8685 Val Phe Gly GlyAla Gly Gln Gly Ala Tyr Ala Ala Ala Asn Ala Thr 8690 8695 8700 Leu AspAla Leu Ala Trp Arg Arg Arg Thr Ala Gly Leu Pro Ala Leu 8705 8710 87158720 Ser Leu Gly Trp Gly Leu Trp Ala Glu Thr Ser Gly Met Thr Gly Gly8725 8730 8735 Leu Ser Asp Thr Asp Arg Ser Arg Leu Ala Arg Ser Gly AlaThr Pro 8740 8745 8750 Met Asp Ser Glu Leu Thr Leu Ser Leu Leu Asp AlaAla Met Arg Arg 8755 8760 8765 Asp Asp Pro Ala Leu Val Pro Ile Ala LeuAsp Val Ala Ala Leu Arg 8770 8775 8780 Ala Gln Gln Arg Asp Gly Met LeuAla Pro Leu Leu Ser Gly Leu Thr 8785 8790 8795 8800 Arg Gly Ser Arg ValGly Gly Ala Pro Val Asn Gln Arg Arg Ala Ala 8805 8810 8815 Ala Gly GlyAla Gly Glu Ala Asp Thr Asp Leu Gly Gly Arg Leu Ala 8820 8825 8830 AlaMet Thr Pro Asp Asp Arg Val Ala His Leu Arg Asp Leu Val Arg 8835 88408845 Thr His Val Ala Thr Val Leu Gly His Gly Thr Pro Ser Arg Val Asp8850 8855 8860 Leu Glu Arg Ala Phe Arg Asp Thr Gly Phe Asp Ser Leu ThrAla Val 8865 8870 8875 8880 Glu Leu Arg Asn Arg Leu Asn Ala Ala Thr GlyLeu Arg Leu Pro Ala 8885 
8890 8895 Thr Leu Val Phe Asp His Pro Thr ProGly Glu Leu Ala Gly His Leu 8900 8905 8910 Leu Asp Glu Leu Ala Thr AlaAla Gly Gly Ser Trp Ala Glu Gly Thr 8915 8920 8925 Gly Ser Gly Asp ThrAla Ser Ala Thr Asp Arg Gln Thr Thr Ala Ala 8930 8935 8940 Leu Ala GluLeu Asp Arg Leu Glu Gly Val Leu Ala Ser Leu Ala Pro 8945 8950 8955 8960Ala Ala Gly Gly Arg Pro Glu Leu Ala Ala Arg Leu Arg Ala Leu Ala 89658970 8975 Ala Ala Leu Gly Asp Asp Gly Asp Asp Ala Thr Asp Leu Asp GluAla 8980 8985 8990 Ser Asp Asp Asp Leu Phe Ser Phe Ile Asp Lys Glu LeuGly Asp Ser 8995 9000 9005 Asp Phe Met Ala Asn Asn Glu Asp Lys Leu ArgAsp Tyr Leu Lys Arg 9010 9015 9020 Val Thr Ala Glu Leu Gln Gln Asn ThrArg Arg Leu Arg Glu Ile Glu 9025 9030 9035 9040 Gly Arg Thr His Glu ProVal Ala Ile Val Gly Met Ala Cys Arg Leu 9045 9050 9055 Pro Gly Gly ValAla Ser Pro Glu Asp Leu Trp Gln Leu Val Ala Gly 9060 9065 9070 Asp GlyAsp Ala Ile Ser Glu Phe Pro Gln Asp Arg Gly Trp Asp Val 9075 9080 9085Glu Gly Leu Tyr Asp Pro Asp Pro Asp Ala Ser Gly Arg Thr Tyr Cys 90909095 9100 Arg Ser Gly Gly Phe Leu His Asp Ala Gly Glu Phe Asp Ala AspPhe 9105 9110 9115 9120 Phe Gly Ile Ser Pro Arg Glu Ala Leu Ala Met AspPro Gln Gln Arg 9125 9130 9135 Leu Ser Leu Thr Thr Ala Trp Glu Ala IleGlu Ser Ala Gly Ile Asp 9140 9145 9150 Pro Thr Ala Leu Lys Gly Ser GlyLeu Gly Val Phe Val Gly Gly Trp 9155 9160 9165 His Thr Gly Tyr Thr SerGly Gln Thr Thr Ala Val Gln Ser Pro Glu 9170 9175 9180 Leu Glu Gly HisLeu Val Ser Gly Ala Ala Leu Gly Phe Leu Ser Gly 9185 9190 9195 9200 ArgIle Ala Tyr Val Leu Gly Thr Asp Gly Pro Ala Leu Thr Val Asp 9205 92109215 Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu Ala Val Gln Ala9220 9225 9230 Leu Arg Lys Gly Glu Cys Asp Met Ala Leu Ala Gly Gly ValThr Val 9235 9240 9245 Met Pro Asn Ala Asp Leu Phe Val Gln Phe Ser ArgGln Arg Gly Leu 9250 9255 9260 Ala Ala Asp Gly Arg Ser Lys Ala Phe AlaThr Ser Ala Asp Gly Phe 9265 9270 9275 9280 Gly Pro Ala Glu Gly Ala GlyVal Leu Leu Val Glu Arg Leu Ser Asp 9285 9290 9295 Ala Arg Arg Asn 
GlyHis Arg Ile Leu Ala Val Val Arg Gly Ser Ala 9300 9305 9310 Val Asn GlnAsp Gly Ala Ser Asn Gly Leu Thr Ala Pro His Gly Pro 9315 9320 9325 SerGln Gln Arg Val Ile Arg Arg Ala Leu Ala Asp Ala Arg Leu Ala 9330 93359340 Pro Gly Asp Val Asp Val Val Glu Ala His Gly Thr Gly Thr Arg Leu9345 9350 9355 9360 Gly Asp Pro Ile Glu Ala Gln Ala Leu Ile Ala Thr TyrGly Gln Glu 9365 9370 9375 Lys Ser Ser Glu Gln Pro Leu Arg Leu Gly AlaLeu Lys Ser Asn Ile 9380 9385 9390 Gly His Thr Gln Ala Ala Ala Gly ValAla Gly Val Ile Lys Met Val 9395 9400 9405 Gln Ala Met Arg His Gly LeuLeu Pro Lys Thr Leu His Val Asp Glu 9410 9415 9420 Pro Ser Asp Gln IleAsp Trp Ser Ala Gly Thr Val Glu Leu Leu Thr 9425 9430 9435 9440 Glu AlaVal Asp Trp Pro Glu Lys Gln Asp Gly Gly Leu Arg Arg Ala 9445 9450 9455Ala Val Ser Ser Phe Gly Ile Ser Gly Thr Asn Ala His Val Val Leu 94609465 9470 Glu Glu Ala Pro Ala Val Glu Asp Ser Pro Ala Val Glu Pro ProAla 9475 9480 9485 Gly Gly Gly Val Val Pro Trp Pro Val Ser Ala Lys ThrPro Ala Ala 9490 9495 9500 Leu Asp Ala Gln Ile Gly Gln Leu Ala Ala TyrAla Asp Gly Arg Thr 9505 9510 9515 9520 Asp Val Asp Pro Ala Val Ala AlaArg Ala Leu Val Asp Ser Arg Thr 9525 9530 9535 Ala Met Glu His Arg AlaVal Ala Val Gly Asp Ser Arg Glu Ala Leu 9540 9545 9550 Arg Asp Ala LeuArg Met Pro Glu Gly Leu Val Arg Gly Thr Ser Ser 9555 9560 9565 Asp ValGly Arg Val Ala Phe Val Phe Pro Gly Gln Gly Thr Gln Trp 9570 9575 9580Ala Gly Met Gly Ala Glu Leu Leu Asp Ser Ser Pro Glu Phe Ala Ala 95859590 9595 9600 Ser Met Ala Glu Cys Glu Thr Ala Leu Ser Arg Tyr Val AspTrp Ser 9605 9610 9615 Leu Glu Ala Val Val Arg Gln Glu Pro Gly Ala ProThr Leu Asp Arg 9620 9625 9630 Val Asp Val Val Gln Pro Val Thr Phe AlaVal Met Val Ser Leu Ala 9635 9640 9645 Lys Val Trp Gln His His Gly IleThr Pro Gln Ala Val Val Gly His 9650 9655 9660 Ser Gln Gly Glu Ile AlaAla Ala Tyr Val Ala Gly Ala Leu Thr Leu 9665 9670 9675 9680 Asp Asp AlaAla Arg Val Val Thr Leu Arg Ser Lys Ser Ile Ala Ala 9685 9690 9695 HisLeu Ala Gly Lys Gly Gly Met Ile Ser Leu 
Ala Leu Asp Glu Ala 9700 97059710 Ala Val Leu Lys Arg Leu Ser Asp Phe Asp Gly Leu Ser Val Ala Ala9715 9720 9725 Val Asn Gly Pro Thr Ala Thr Val Val Ser Gly Asp Pro ThrGln Ile 9730 9735 9740 Glu Glu Leu Ala Arg Thr Cys Glu Ala Asp Gly ValArg Ala Arg Ile 9745 9750 9755 9760 Ile Pro Val Asp Tyr Ala Ser His SerArg Gln Val Glu Ile Ile Glu 9765 9770 9775 Lys Glu Leu Ala Glu Val LeuAla Gly Leu Ala Pro Gln Ala Pro His 9780 9785 9790 Val Pro Phe Phe SerThr Leu Glu Gly Thr Trp Ile Thr Glu Pro Val 9795 9800 9805 Leu Asp GlyThr Tyr Trp Tyr Arg Asn Leu Arg His Arg Val Gly Phe 9810 9815 9820 AlaPro Ala Val Glu Thr Leu Ala Val Asp Gly Phe Thr His Phe Ile 9825 98309835 9840 Glu Val Ser Ala His Pro Val Leu Thr Met Thr Leu Pro Glu ThrVal 9845 9850 9855 Thr Gly Leu Gly Thr Leu Arg Arg Glu Gln Gly Gly GlnGlu Arg Leu 9860 9865 9870 Val Thr Ser Leu Ala Glu Ala Trp Ala Asn GlyLeu Thr Ile Asp Trp 9875 9880 9885 Ala Pro Ile Leu Pro Thr Ala Thr GlyHis His Pro Glu Leu Pro Thr 9890 9895 9900 Tyr Ala Phe Gln Thr Glu ArgPhe Trp Leu Gln Ser Ser Ala Pro Thr 9905 9910 9915 9920 Ser Ala Ala AspAsp Trp Arg Tyr Arg Val Glu Trp Lys Pro Leu Thr 9925 9930 9935 Ala SerGly Gln Ala Asp Leu Ser Gly Arg Trp Ile Val Ala Val Gly 9940 9945 9950Ser Glu Pro Glu Ala Glu Leu Leu Gly Ala Leu Lys Ala Ala Gly Ala 99559960 9965 Glu Val Asp Val Leu Glu Ala Gly Ala Asp Asp Asp Arg Glu AlaLeu 9970 9975 9980 Ala Ala Arg Leu Thr Ala Leu Thr Thr Gly Asp Gly PheThr Gly Val 9985 9990 9995 10000 Val Ser Leu Leu Asp Asp Leu Val Pro GlnVal Ala Trp Val Gln Ala 10005 10010 10015 Leu Gly Asp Ala Gly Ile LysAla Pro Leu Trp Ser Val Thr Gln Gly 10020 10025 10030 Ala Val Ser ValGly Arg Leu Asp Thr Pro Ala Asp Pro Asp Arg Ala 10035 10040 10045 MetLeu Trp Gly Leu Gly Arg Val Val Ala Leu Glu His Pro Glu Arg 10050 1005510060 Trp Ala Gly Leu Val Asp Leu Pro Ala Gln Pro Asp Ala Ala Ala Leu10065 10070 10075 10080 Ala His Leu Val Thr Ala Leu Ser Gly Ala Thr GlyGlu Asp Gln Ile 10085 10090 10095 Ala Ile Arg Thr Thr Gly Leu His AlaArg Arg Leu Ala 
Arg Ala Pro 10100 10105 10110 Leu His Gly Arg Arg ProThr Arg Asp Trp Gln Pro His Gly Thr Val 10115 10120 10125 Leu Ile ThrGly Gly Thr Gly Ala Leu Gly Ser His Ala Ala Arg Trp 10130 10135 10140Met Ala His His Gly Ala Glu His Leu Leu Leu Val Ser Arg Ser Gly 1014510150 10155 10160 Glu Gln Ala Pro Gly Ala Thr Gln Leu Thr Ala Glu LeuThr Ala Ser 10165 10170 10175 Gly Ala Arg Val Thr Ile Ala Ala Cys AspVal Ala Asp Pro His Ala 10180 10185 10190 Met Arg Thr Leu Leu Asp AlaIle Pro Ala Glu Thr Pro Leu Thr Ala 10195 10200 10205 Val Val His ThrAla Gly Ala Pro Gly Gly Asp Pro Leu Asp Val Thr 10210 10215 10220 GlyPro Glu Asp Ile Ala Arg Ile Leu Gly Ala Lys Thr Ser Gly Ala 10225 1023010235 10240 Glu Val Leu Asp Asp Leu Leu Arg Gly Thr Pro Leu Asp Ala PheVal 10245 10250 10255 Leu Tyr Ser Ser Asn Ala Gly Val Trp Gly Ser GlySer Gln Gly Val 10260 10265 10270 Tyr Ala Ala Ala Asn Ala His Leu AspAla Leu Ala Ala Arg Arg Arg 10275 10280 10285 Ala Arg Gly Glu Thr AlaThr Ser Val Ala Trp Gly Leu Trp Ala Gly 10290 10295 10300 Asp Gly MetGly Arg Gly Ala Asp Asp Ala Tyr Trp Gln Arg Arg Gly 10305 10310 1031510320 Ile Arg Pro Met Ser Pro Asp Arg Ala Leu Asp Glu Leu Ala Lys Ala10325 10330 10335 Leu Ser His Asp Glu Thr Phe Val Ala Val Ala Asp ValAsp Trp Glu 10340 10345 10350 Arg Phe Ala Pro Ala Phe Thr Val Ser ArgPro Ser Leu Leu Leu Asp 10355 10360 10365 Gly Val Pro Glu Ala Arg GlnAla Leu Ala Ala Pro Val Gly Ala Pro 10370 10375 10380 Ala Pro Gly AspAla Ala Val Ala Pro Thr Gly Gln Ser Ser Ala Leu 10385 10390 10395 10400Ala Ala Ile Thr Ala Leu Pro Glu Pro Glu Arg Arg Pro Ala Leu Leu 1040510410 10415 Thr Leu Val Arg Thr His Ala Ala Ala Val Leu Gly His Ser SerPro 10420 10425 10430 Asp Arg Val Ala Pro Gly Arg Ala Phe Thr Glu LeuGly Phe Asp Ser 10435 10440 10445 Leu Thr Ala Val Gln Leu Arg Asn GlnLeu Ser Thr Val Val Gly Asn 10450 10455 10460 Arg Leu Pro Ala Thr ThrVal Phe Asp His Pro Thr Pro Ala Ala Leu 10465 10470 10475 10480 Ala AlaHis Leu His Glu Ala Tyr Leu Ala Pro Ala Glu Pro Ala Pro 10485 1049010495 
Thr Asp Trp Glu Gly Arg Val Arg Arg Ala Leu Ala Glu Leu Pro Leu10500 10505 10510 Asp Arg Leu Arg Asp Ala Gly Val Leu Asp Thr Val LeuArg Leu Thr 10515 10520 10525 Gly Ile Glu Pro Glu Pro Gly Ser Gly GlySer Asp Gly Gly Ala Ala 10530 10535 10540 Asp Pro Gly Ala Glu Pro GluAla Ser Ile Asp Asp Leu Asp Ala Glu 10545 10550 10555 10560 Ala Leu IleArg Met Ala Leu Gly Pro Arg Asn Thr Met Thr Ser Ser 10565 10570 10575Asn Glu Gln Leu Val Asp Ala Leu Arg Ala Ser Leu Lys Glu Asn Glu 1058010585 10590 Glu Leu Arg Lys Glu Ser Arg Arg Arg Ala Asp Arg Arg Gln GluPro 10595 10600 10605 Met Ala Ile Val Gly Met Ser Cys Arg Phe Ala GlyGly Ile Arg Ser 10610 10615 10620 Pro Glu Asp Leu Trp Asp Ala Val AlaAla Gly Lys Asp Leu Val Ser 10625 10630 10635 10640 Glu Val Pro Glu GluArg Gly Trp Asp Ile Asp Ser Leu Tyr Asp Pro 10645 10650 10655 Val ProGly Arg Lys Gly Thr Thr Tyr Val Arg Asn Ala Ala Phe Leu 10660 1066510670 Asp Asp Ala Ala Gly Phe Asp Ala Ala Phe Phe Gly Ile Ser Pro Arg10675 10680 10685 Glu Ala Leu Ala Met Asp Pro Gln Gln Arg Gln Leu LeuGlu Ala Ser 10690 10695 10700 Trp Glu Val Phe Glu Arg Ala Gly Ile AspPro Ala Ser Val Arg Gly 10705 10710 10715 10720 Thr Asp Val Gly Val TyrVal Gly Cys Gly Tyr Gln Asp Tyr Ala Pro 10725 10730 10735 Asp Ile ArgVal Ala Pro Glu Gly Thr Gly Gly Tyr Val Val Thr Gly 10740 10745 10750Asn Ser Ser Ala Val Ala Ser Gly Arg Ile Ala Tyr Ser Leu Gly Leu 1075510760 10765 Glu Gly Pro Ala Val Thr Val Asp Thr Ala Cys Ser Ser Ser LeuVal 10770 10775 10780 Ala Leu His Leu Ala Leu Lys Gly Leu Arg Asn GlyAsp Cys Ser Thr 10785 10790 10795 10800 Ala Leu Val Gly Gly Val Ala ValLeu Ala Thr Pro Gly Ala Phe Ile 10805 10810 10815 Glu Phe Ser Ser GlnGln Ala Met Ala Ala Asp Gly Arg Thr Lys Gly 10820 10825 10830 Phe AlaSer Ala Ala Asp Gly Leu Ala Trp Gly Glu Gly Val Ala Val 10835 1084010845 Leu Leu Leu Glu Arg Leu Ser Asp Ala Arg Arg Lys Gly His Arg Val10850 10855 10860 Leu Ala Val Val Arg Gly Ser Ala Ile Asn Gln Asp GlyAla Ser Asn 10865 10870 10875 10880 Gly Leu Thr Ala Pro His Gly 
Pro SerGln Gln His Leu Ile Arg Gln 10885 10890 10895 Ala Leu Ala Asp Ala ArgLeu Thr Ser Ser Asp Val Asp Val Val Glu 10900 10905 10910 Gly His GlyThr Gly Thr Arg Leu Gly Asp Pro Ile Glu Ala Gln Ala 10915 10920 10925Leu Leu Ala Thr Tyr Gly Gln Gly Arg Ala Pro Gly Gln Pro Leu Arg 1093010935 10940 Leu Gly Thr Leu Lys Ser Asn Ile Gly His Thr Gln Ala Ala SerGly 10945 10950 10955 10960 Val Ala Gly Val Ile Lys Met Val Gln Ala LeuArg His Gly Val Leu 10965 10970 10975 Pro Lys Thr Leu His Val Asp GluPro Thr Asp Gln Val Asp Trp Ser 10980 10985 10990 Ala Gly Ser Val GluLeu Leu Thr Glu Ala Val Asp Trp Pro Glu Arg 10995 11000 11005 Pro GlyArg Leu Arg Arg Ala Gly Val Ser Ala Phe Gly Val Gly Gly 11010 1101511020 Thr Asn Ala His Val Val Leu Glu Glu Ala Pro Ala Val Glu Glu Ser11025 11030 11035 11040 Pro Ala Val Glu Pro Pro Ala Gly Gly Gly Val ValPro Trp Pro Val 11045 11050 11055 Ser Ala Lys Thr Ser Ala Ala Leu AspAla Gln Ile Gly Gln Leu Ala 11060 11065 11070 Ala Tyr Ala Glu Asp ArgThr Asp Val Asp Pro Ala Val Ala Ala Arg 11075 11080 11085 Ala Leu ValAsp Ser Arg Thr Ala Met Glu His Arg Ala Val Ala Val 11090 11095 11100Gly Asp Ser Arg Glu Ala Leu Arg Asp Ala Leu Arg Met Pro Glu Gly 1110511110 11115 11120 Leu Val Arg Gly Thr Val Thr Asp Pro Gly Arg Val AlaPhe Val Phe 11125 11130 11135 Pro Gly Gln Gly Thr Gln Trp Ala Gly MetGly Ala Glu Leu Leu Asp 11140 11145 11150 Ser Ser Pro Glu Phe Ala AlaAla Met Ala Glu Cys Glu Thr Ala Leu 11155 11160 11165 Ser Pro Tyr ValAsp Trp Ser Leu Glu Ala Val Val Arg Gln Ala Pro 11170 11175 11180 SerAla Pro Thr Leu Asp Arg Val Asp Val Val Gln Pro Val Thr Phe 11185 1119011195 11200 Ala Val Met Val Ser Leu Ala Lys Val Trp Gln His His Gly IleThr 11205 11210 11215 Pro Glu Ala Val Ile Gly His Ser Gln Gly Glu IleAla Ala Ala Tyr 11220 11225 11230 Val Ala Gly Ala Leu Thr Leu Asp AspAla Ala Arg Val Val Thr Leu 11235 11240 11245 Arg Ser Lys Ser Ile AlaAla His Leu Ala Gly Lys Gly Gly Met Ile 11250 11255 11260 Ser Leu AlaLeu Ser Glu Glu Ala Thr Arg Gln Arg Ile Glu Asn Leu 
11265 11270 1127511280 His Gly Leu Ser Ile Ala Ala Val Asn Gly Pro Thr Ala Thr Val Val11285 11290 11295 Ser Gly Asp Pro Thr Gln Ile Gln Glu Leu Ala Gln AlaCys Glu Ala 11300 11305 11310 Asp Gly Ile Arg Ala Arg Ile Ile Pro ValAsp Tyr Ala Ser His Ser 11315 11320 11325 Ala His Val Glu Thr Ile GluAsn Glu Leu Ala Asp Val Leu Ala Gly 11330 11335 11340 Leu Ser Pro GlnThr Pro Gln Val Pro Phe Phe Ser Thr Leu Glu Gly 11345 11350 11355 11360Thr Trp Ile Thr Glu Pro Ala Leu Asp Gly Gly Tyr Trp Tyr Arg Asn 1136511370 11375 Leu Arg His Arg Val Gly Phe Ala Pro Ala Val Glu Thr Leu AlaThr 11380 11385 11390 Asp Glu Gly Phe Thr His Phe Ile Glu Val Ser AlaHis Pro Val Leu 11395 11400 11405 Thr Met Thr Leu Pro Asp Lys Val ThrGly Leu Ala Thr Leu Arg Arg 11410 11415 11420 Glu Asp Gly Gly Gln HisArg Leu Thr Thr Ser Leu Ala Glu Ala Trp 11425 11430 11435 11440 Ala AsnGly Leu Ala Leu Asp Trp Ala Ser Leu Leu Pro Ala Thr Gly 11445 1145011455 Ala Leu Ser Pro Ala Val Pro Asp Leu Pro Thr Tyr Ala Phe Gln His11460 11465 11470 Arg Ser Tyr Trp Ile Ser Pro Ala Gly Pro Gly Glu AlaPro Ala His 11475 11480 11485 Thr Ala Ser Gly Arg Glu Ala Val Ala GluThr Gly Leu Ala Trp Gly 11490 11495 11500 Pro Gly Ala Glu Asp Leu AspGlu Glu Gly Arg Arg Ser Ala Val Leu 11505 11510 11515 11520 Ala Met ValMet Arg Gln Ala Ala Ser Val Leu Arg Cys Asp Ser Pro 11525 11530 11535Glu Glu Val Pro Val Asp Arg Pro Leu Arg Glu Ile Gly Phe Asp Ser 1154011545 11550 Leu Thr Ala Val Asp Phe Arg Asn Arg Val Asn Arg Leu Thr GlyLeu 11555 11560 11565 Gln Leu Pro Pro Thr Val Val Phe Gln His Pro ThrPro Val Ala Leu 11570 11575 11580 Ala Glu Arg Ile Ser Asp Glu Leu AlaGlu Arg Asn Trp Ala Val Ala 11585 11590 11595 11600 Glu Pro Ser Asp HisGlu Gln Ala Glu Glu Glu Lys Ala Ala Ala Pro 11605 11610 11615 Ala GlyAla Arg Ser Gly Ala Asp Thr Gly Ala Gly Ala Gly Met Phe 11620 1162511630 Arg Ala Leu Phe Arg Gln Ala Val Glu Asp Asp Arg Tyr Gly Glu Phe11635 11640 11645 Leu Asp Val Leu Ala Glu Ala Ser Ala Phe Arg Pro GlnPhe Ala Ser 11650 11655 11660 Pro Glu Ala 
Cys Ser Glu Arg Leu Asp ProVal Leu Leu Ala Gly Gly 11665 11670 11675 11680 Pro Thr Asp Arg Ala GluGly Arg Ala Val Leu Val Gly Cys Thr Gly 11685 11690 11695 Thr Ala AlaAsn Gly Gly Pro His Glu Phe Leu Arg Leu Ser Thr Ser 11700 11705 11710Phe Gln Glu Glu Arg Asp Phe Leu Ala Val Pro Leu Pro Gly Tyr Gly 1171511720 11725 Thr Gly Thr Gly Thr Gly Thr Ala Leu Leu Pro Ala Asp Leu AspThr 11730 11735 11740 Ala Leu Asp Ala Gln Ala Arg Ala Ile Leu Arg AlaAla Gly Asp Ala 11745 11750 11755 11760 Pro Val Val Leu Leu Gly His SerGly Gly Ala Leu Leu Ala His Glu 11765 11770 11775 Leu Ala Phe Arg LeuGlu Arg Ala His Gly Ala Pro Pro Ala Gly Ile 11780 11785 11790 Val LeuVal Asp Pro Tyr Pro Pro Gly His Gln Glu Pro Ile Glu Val 11795 1180011805 Trp Ser Arg Gln Leu Gly Glu Gly Leu Phe Ala Gly Glu Leu Glu Pro11810 11815 11820 Met Ser Asp Ala Arg Leu Leu Ala Met Gly Arg Tyr AlaArg Phe Leu 11825 11830 11835 11840 Ala Gly Pro Arg Pro Gly Arg Ser SerAla Pro Val Leu Leu Val Arg 11845 11850 11855 Ala Ser Glu Pro Leu GlyAsp Trp Gln Glu Glu Arg Gly Asp Trp Arg 11860 11865 11870 Ala His TrpAsp Leu Pro His Thr Val Ala Asp Val Pro Gly Asp His 11875 11880 11885Phe Thr Met Met Arg Asp His Ala Pro Ala Val Ala Glu Ala Val Leu 1189011895 11900 Ser Trp Leu Asp Ala Ile Glu Gly Ile Glu Gly Ala Gly Lys MetThr 11905 11910 11915 11920 Asp Arg Pro Leu Asn Val Asp Ser Gly Leu TrpIle Arg Arg Phe His 11925 11930 11935 Pro Ala Pro Asn Ser Ala Val ArgLeu Val Cys Leu Pro His Ala Gly 11940 11945 11950 Gly Ser Ala Ser TyrPhe Phe Arg Phe Ser Glu Glu Leu His Pro Ser 11955 11960 11965 Val GluAla Leu Ser Val Gln Tyr Pro Gly Arg Gln Asp Arg Arg Ala 11970 1197511980 Glu Pro Cys Leu Glu Ser Val Glu Glu Leu Ala Glu His Val Val Ala11985 11990 11995 12000 Ala Thr Glu Pro Trp Trp Gln Glu Gly Arg Leu AlaPhe Phe Gly His 12005 12010 12015 Ser Leu Gly Ala Ser Val Ala Phe GluThr Ala Arg Ile Leu Glu Gln 12020 12025 12030 Arg His Gly Val Arg ProGlu Gly Leu Tyr Val Ser Gly Arg Arg Ala 12035 12040 12045 Pro Ser LeuAla Pro Asp Arg Leu Val His 
Gln Leu Asp Asp Arg Ala 12050 12055 12060Phe Leu Ala Glu Ile Arg Arg Leu Ser Gly Thr Asp Glu Arg Phe Leu 1206512070 12075 12080 Gln Asp Asp Glu Leu Leu Arg Leu Val Leu Pro Ala LeuArg Ser Asp 12085 12090 12095 Tyr Lys Ala Ala Glu Thr Tyr Leu His ArgPro Ser Ala Lys Leu Thr 12100 12105 12110 Cys Pro Val Met Ala Leu AlaGly Asp Arg Asp Pro Lys Ala Pro Leu 12115 12120 12125 Asn Glu Val AlaGlu Trp Arg Arg His Thr Ser Gly Pro Phe Cys Leu 12130 12135 12140 ArgAla Tyr Ser Gly Gly His Phe Tyr Leu Asn Asp Gln Trp His Glu 12145 1215012155 12160 Ile Cys Asn Asp Ile Ser Asp His Leu Leu Val Thr Arg Gly AlaPro 12165 12170 12175 Asp Ala Arg Val Val Gln Pro Pro Thr Ser Leu IleGlu Gly Ala Ala 12180 12185 12190 Lys Arg Trp Gln Asn Pro Arg 12195 71248 DNA Streptomyces venezuelae 7 gtgaaaagcg ccttatccga cctcgcattcttcggcggcc ccgccgcttt cgaccagccg 60 ctcctcgtgg ggcggcccaa ccgcatcgaccgcgccaggc tgtacgagcg gctcgaccgg 120 gccctcgaca gccagtggct gtccaacggcggcccgctcg tccgcgagtt cgaggagcgc 180 gtcgccgggc tcgccggggt ccggcatgccgtggccacct gcaacgccac ggccgggctc 240 cagctcctcg cgcacgccgc cggcctcaccggcgaagtga tcatgccgtc gatgacgttc 300 gccgccaccc cgcacgcact gcgctggatcggcctcaccc cggtcttcgc cgacatcgac 360 ccggacaccg gcaacctcga cccggaccaggtggccgccg cggtcacacc ccgcacctcg 420 gccgtcgtcg gcgtccacct ctggggccgcccctgcgccg ccgaccagct gcggaaggtc 480 gccgacgagc acggcctgcg gctgtacttcgacgccgcgc acgccctcgg ctgcgcggtc 540 gacggccggc ccgccggcag cctcggcgacgccgaggtct tcagcttcca cgccaccaag 600 gccgtcaacg ccttcgaggg cggcgccgtcgtcaccgacg acgccgacct cgccgcccgg 660 atccgcgccc tccacaactt cggcttcgacctgcccggcg gcagccccgc cggcgggacc 720 aacgccaaga tgagcgaggc cgccgccgccatgggcctca cctccctcga cgcgtttccc 780 gaggtcatcg accggaaccg gcgcaaccacgccgcctacc gcgagcacct cgcggacctc 840 cccggcgtcc tcgtcgccga ccacgaccgccacggcctca acaaccacca gtacgtgatc 900 gtcgagatcg acgaggccac caccggcatccaccgcgacc tcgtcatgga ggtcctgaag 960 gccgaaggcg tgcacacccg cgcctacttctcgccgggct gccacgagct ggagccgtac 1020 cgcgggcagc cgcacgcccc gctgccgcacaccgaacgcc tcgccgcgcg 
cgtgctgtcc 1080 ctgccgaccg gcaccgccat cggcgacgacgacatccgcc gggtcgccga cctgctgcgt 1140 ctctgcgcga cccgcggccg cgaactgaccgcgcgccacc gcgacacggc ccccgccccg 1200 ctcgcggccc cccagacatc cacgcccacgattggacgct cccgatga 1248 8 415 PRT Streptomyces venezuelae 8 Met Lys SerAla Leu Ser Asp Leu Ala Phe Phe Gly Gly Pro Ala Ala 1 5 10 15 Phe AspGln Pro Leu Leu Val Gly Arg Pro Asn Arg Ile Asp Arg Ala 20 25 30 Arg LeuTyr Glu Arg Leu Asp Arg Ala Leu Asp Ser Gln Trp Leu Ser 35 40 45 Asn GlyGly Pro Leu Val Arg Glu Phe Glu Glu Arg Val Ala Gly Leu 50 55 60 Ala GlyVal Arg His Ala Val Ala Thr Cys Asn Ala Thr Ala Gly Leu 65 70 75 80 GlnLeu Leu Ala His Ala Ala Gly Leu Thr Gly Glu Val Ile Met Pro 85 90 95 SerMet Thr Phe Ala Ala Thr Pro His Ala Leu Arg Trp Ile Gly Leu 100 105 110Thr Pro Val Phe Ala Asp Ile Asp Pro Asp Thr Gly Asn Leu Asp Pro 115 120125 Asp Gln Val Ala Ala Ala Val Thr Pro Arg Thr Ser Ala Val Val Gly 130135 140 Val His Leu Trp Gly Arg Pro Cys Ala Ala Asp Gln Leu Arg Lys Val145 150 155 160 Ala Asp Glu His Gly Leu Arg Leu Tyr Phe Asp Ala Ala HisAla Leu 165 170 175 Gly Cys Ala Val Asp Gly Arg Pro Ala Gly Ser Leu GlyAsp Ala Glu 180 185 190 Val Phe Ser Phe His Ala Thr Lys Ala Val Asn AlaPhe Glu Gly Gly 195 200 205 Ala Val Val Thr Asp Asp Ala Asp Leu Ala AlaArg Ile Arg Ala Leu 210 215 220 His Asn Phe Gly Phe Asp Leu Pro Gly GlySer Pro Ala Gly Gly Thr 225 230 235 240 Asn Ala Lys Met Ser Glu Ala AlaAla Ala Met Gly Leu Thr Ser Leu 245 250 255 Asp Ala Phe Pro Glu Val IleAsp Arg Asn Arg Arg Asn His Ala Ala 260 265 270 Tyr Arg Glu His Leu AlaAsp Leu Pro Gly Val Leu Val Ala Asp His 275 280 285 Asp Arg His Gly LeuAsn Asn His Gln Tyr Val Ile Val Glu Ile Asp 290 295 300 Glu Ala Thr ThrGly Ile His Arg Asp Leu Val Met Glu Val Leu Lys 305 310 315 320 Ala GluGly Val His Thr Arg Ala Tyr Phe Ser Pro Gly Cys His Glu 325 330 335 LeuGlu Pro Tyr Arg Gly Gln Pro His Ala Pro Leu Pro His Thr Glu 340 345 350Arg Leu Ala Ala Arg Val Leu Ser Leu Pro Thr Gly Thr Ala Ile Gly 355 360365 Asp Asp Asp Ile Arg Arg Val 
Ala Asp Leu Leu Arg Leu Cys Ala Thr 370375 380 Arg Gly Arg Glu Leu Thr Ala Arg His Arg Asp Thr Ala Pro Ala Pro385 390 395 400 Leu Ala Ala Pro Gln Thr Ser Thr Pro Thr Ile Gly Arg SerArg 405 410 415 9 1458 DNA Streptomyces venezuelae 9 atgaccgcccccgccctttc cgccaccgcc ccggccgaac gctgcgcgca ccccggagcc 60 gatctgggggcggcggtcca cgccgtcggc cagaccctcg ccgccggcgg cctcgtgccg 120 cccgacgaggccggaacgac cgcccgccac ctcgtccggc tcgccgtgcg ctacggcaac 180 agccccttcaccccgctgga ggaggcccgc cacgacctgg gcgtcgaccg ggacgccttc 240 cggcgcctcctcgccctgtt cgggcaggtc ccggagctcc gcaccgcggt cgagaccggc 300 cccgccggggcgtactggaa gaacaccctg ctcccgctcg aacagcgcgg cgtcttcgac 360 gcggcgctcgccaggaagcc cgtcttcccg tacagcgtcg gcctctaccc cggcccgacc 420 tgcatgttccgctgccactt ctgcgtccgt gtgaccggcg cccgctacga cccgtccgcc 480 ctcgacgccggcaacgccat gttccggtcg gtcatcgacg agatacccgc gggcaacccc 540 tcggcgatgtacttctccgg cggcctggag ccgctcacca accccggcct cgggagcctg 600 gccgcgcacgccaccgacca cggcctgcgg cccaccgtct acacgaactc cttcgcgctc 660 accgagcgcaccctggagcg ccagcccggc ctctggggcc tgcacgccat ccgcacctcg 720 ctctacggcctcaacgacga ggagtacgag cagaccaccg gcaagaaggc cgccttccgc 780 cgcgtccgcgagaacctgcg ccgcttccag cagctgcgcg ccgagcgcga gtcgccgatc 840 aacctcggcttcgcctacat cgtgctcccg ggccgtgcct cccgcctgct cgacctggtc 900 gacttcatcgccgacctcaa cgacgccggg cagggcagga cgatcgactt cgtcaacatt 960 cgcgaggactacagcggccg tgacgacggc aagctgccgc aggaggagcg ggccgagctc 1020 caggaggccctcaacgcctt cgaggagcgg gtccgcgagc gcacccccgg actccacatc 1080 gactacggctacgccctgaa cagcctgcgc accggggccg acgccgaact gctgcggatc 1140 aagcccgccaccatgcggcc caccgcgcac ccgcaggtcg cggtgcaggt cgatctcctc 1200 ggcgacgtgtacctgtaccg cgaggccggc ttccccgacc tggacggcgc gacccgctac 1260 atcgcgggccgcgtgacccc cgacacctcc ctcaccgagg tcgtcaggga cttcgtcgag 1320 cgcggcggcgaggtggcggc cgtcgacggc gacgagtact tcatggacgg cttcgatcag 1380 gtcgtcaccgcccgcctgaa ccagctggag cgcgacgccg cggacggctg ggaggaggcc 1440 cgcggcttcctgcgctga 1458 10 485 PRT Streptomyces venezuelae 10 Met Thr Ala Pro AlaLeu Ser Ala Thr Ala 
Pro Ala Glu Arg Cys Ala 1 5 10 15 His Pro Gly AlaAsp Leu Gly Ala Ala Val His Ala Val Gly Gln Thr 20 25 30 Leu Ala Ala GlyGly Leu Val Pro Pro Asp Glu Ala Gly Thr Thr Ala 35 40 45 Arg His Leu ValArg Leu Ala Val Arg Tyr Gly Asn Ser Pro Phe Thr 50 55 60 Pro Leu Glu GluAla Arg His Asp Leu Gly Val Asp Arg Asp Ala Phe 65 70 75 80 Arg Arg LeuLeu Ala Leu Phe Gly Gln Val Pro Glu Leu Arg Thr Ala 85 90 95 Val Glu ThrGly Pro Ala Gly Ala Tyr Trp Lys Asn Thr Leu Leu Pro 100 105 110 Leu GluGln Arg Gly Val Phe Asp Ala Ala Leu Ala Arg Lys Pro Val 115 120 125 PhePro Tyr Ser Val Gly Leu Tyr Pro Gly Pro Thr Cys Met Phe Arg 130 135 140Cys His Phe Cys Val Arg Val Thr Gly Ala Arg Tyr Asp Pro Ser Ala 145 150155 160 Leu Asp Ala Gly Asn Ala Met Phe Arg Ser Val Ile Asp Glu Ile Pro165 170 175 Ala Gly Asn Pro Ser Ala Met Tyr Phe Ser Gly Gly Leu Glu ProLeu 180 185 190 Thr Asn Pro Gly Leu Gly Ser Leu Ala Ala His Ala Thr AspHis Gly 195 200 205 Leu Arg Pro Thr Val Tyr Thr Asn Ser Phe Ala Leu ThrGlu Arg Thr 210 215 220 Leu Glu Arg Gln Pro Gly Leu Trp Gly Leu His AlaIle Arg Thr Ser 225 230 235 240 Leu Tyr Gly Leu Asn Asp Glu Glu Tyr GluGln Thr Thr Gly Lys Lys 245 250 255 Ala Ala Phe Arg Arg Val Arg Glu AsnLeu Arg Arg Phe Gln Gln Leu 260 265 270 Arg Ala Glu Arg Glu Ser Pro IleAsn Leu Gly Phe Ala Tyr Ile Val 275 280 285 Leu Pro Gly Arg Ala Ser ArgLeu Leu Asp Leu Val Asp Phe Ile Ala 290 295 300 Asp Leu Asn Asp Ala GlyGln Gly Arg Thr Ile Asp Phe Val Asn Ile 305 310 315 320 Arg Glu Asp TyrSer Gly Arg Asp Asp Gly Lys Leu Pro Gln Glu Glu 325 330 335 Arg Ala GluLeu Gln Glu Ala Leu Asn Ala Phe Glu Glu Arg Val Arg 340 345 350 Glu ArgThr Pro Gly Leu His Ile Asp Tyr Gly Tyr Ala Leu Asn Ser 355 360 365 LeuArg Thr Gly Ala Asp Ala Glu Leu Leu Arg Ile Lys Pro Ala Thr 370 375 380Met Arg Pro Thr Ala His Pro Gln Val Ala Val Gln Val Asp Leu Leu 385 390395 400 Gly Asp Val Tyr Leu Tyr Arg Glu Ala Gly Phe Pro Asp Leu Asp Gly405 410 415 Ala Thr Arg Tyr Ile Ala Gly Arg Val Thr Pro Asp Thr Ser LeuThr 420 425 430 Glu Val Val 
Arg Asp Phe Val Glu Arg Gly Gly Glu Val AlaAla Val 435 440 445 Asp Gly Asp Glu Tyr Phe Met Asp Gly Phe Asp Gln ValVal Thr Ala 450 455 460 Arg Leu Asn Gln Leu Glu Arg Asp Ala Ala Asp GlyTrp Glu Glu Ala 465 470 475 480 Arg Gly Phe Leu Arg 485 11 879 DNAStreptomyces venezuelae 11 atgaagggaa tagtcctggc cggcgggagc ggaactcggctgcatccggc gacctcggtc 60 atttcgaagc agattcttcc ggtctacaac aaaccgatgatctactatcc gctgtcggtt 120 ctcatgctcg gcggtattcg cgagattcaa atcatctcgaccccccagca catcgaactc 180 ttccagtcgc ttctcggaaa cggcaggcac ctgggaatagaactcgacta tgcggtccag 240 aaagagcccg caggaatcgc ggacgcactt ctcgtcggagccgagcacat cggcgacgac 300 acctgcgccc tgatcctggg cgacaacatc ttccacgggcccggcctcta cacgctcctg 360 cgggacagca tcgcgcgcct cgacggctgc gtgctcttcggctacccggt caaggacccc 420 gagcggtacg gcgtcgccga ggtggacgcg acgggccggctgaccgacct cgtcgagaag 480 cccgtcaagc cgcgctccaa cctcgccgtc accggcctctacctctacga caacgacgtc 540 gtcgacatcg ccaagaacat ccggccctcg ccgcgcggcgagctggagat caccgacgtc 600 aaccgcgtct acctggagcg gggccgggcc gaactcgtcaacctgggccg cggcttcgcc 660 tggctggaca ccggcaccca cgactcgctc ctgcgggccgcccagtacgt ccaggtcctg 720 gaggagcggc agggcgtctg gatcgcgggc cttgaggagatcgccttccg catgggcttc 780 atcgacgccg aggcctgtca cggcctggga gaaggcctctcccgcaccga gtacggcagc 840 tatctgatgg agatcgccgg ccgcgaggga gccccgtga 87912 292 PRT Streptomyces venezuelae 12 Met Lys Gly Ile Val Leu Ala GlyGly Ser Gly Thr Arg Leu His Pro 1 5 10 15 Ala Thr Ser Val Ile Ser LysGln Ile Leu Pro Val Tyr Asn Lys Pro 20 25 30 Met Ile Tyr Tyr Pro Leu SerVal Leu Met Leu Gly Gly Ile Arg Glu 35 40 45 Ile Gln Ile Ile Ser Thr ProGln His Ile Glu Leu Phe Gln Ser Leu 50 55 60 Leu Gly Asn Gly Arg His LeuGly Ile Glu Leu Asp Tyr Ala Val Gln 65 70 75 80 Lys Glu Pro Ala Gly IleAla Asp Ala Leu Leu Val Gly Ala Glu His 85 90 95 Ile Gly Asp Asp Thr CysAla Leu Ile Leu Gly Asp Asn Ile Phe His 100 105 110 Gly Pro Gly Leu TyrThr Leu Leu Arg Asp Ser Ile Ala Arg Leu Asp 115 120 125 Gly Cys Val LeuPhe Gly Tyr Pro Val Lys Asp Pro Glu Arg Tyr Gly 130 135 140 Val Ala 
GluVal Asp Ala Thr Gly Arg Leu Thr Asp Leu Val Glu Lys 145 150 155 160 ProVal Lys Pro Arg Ser Asn Leu Ala Val Thr Gly Leu Tyr Leu Tyr 165 170 175Asp Asn Asp Val Val Asp Ile Ala Lys Asn Ile Arg Pro Ser Pro Arg 180 185190 Gly Glu Leu Glu Ile Thr Asp Val Asn Arg Val Tyr Leu Glu Arg Gly 195200 205 Arg Ala Glu Leu Val Asn Leu Gly Arg Gly Phe Ala Trp Leu Asp Thr210 215 220 Gly Thr His Asp Ser Leu Leu Arg Ala Ala Gln Tyr Val Gln ValLeu 225 230 235 240 Glu Glu Arg Gln Gly Val Trp Ile Ala Gly Leu Glu GluIle Ala Phe 245 250 255 Arg Met Gly Phe Ile Asp Ala Glu Ala Cys His GlyLeu Gly Glu Gly 260 265 270 Leu Ser Arg Thr Glu Tyr Gly Ser Tyr Leu MetGlu Ile Ala Gly Arg 275 280 285 Glu Gly Ala Pro 290 13 1014 DNAStreptomyces venezuelae 13 gtgcggcttc tggtgaccgg aggtgcgggc ttcatcggctcgcacttcgt gcggcagctc 60 ctcgccgggg cgtaccccga cgtgcccgcc gatgaggtgatcgtcctgga cagcctcacc 120 tacgcgggca accgcgccaa cctcgccccg gtggacgcggacccgcgact gcgcttcgtc 180 cacggcgaca tccgcgacgc cggcctcctc gcccgggaactgcgcggcgt ggacgccatc 240 gtccacttcg cggccgagag ccacgtggac cgctccatcgcgggcgcgtc cgtgttcacc 300 gagaccaacg tgcagggcac gcagacgctg ctccagtgcgccgtcgacgc cggcgtcggc 360 cgggtcgtgc acgtctccac cgacgaggtg tacgggtcgatcgactccgg ctcctggacc 420 gagagcagcc cgctggagcc caactcgccc tacgcggcgtccaaggccgg ctccgacctc 480 gttgcccgcg cctaccaccg gacgtacggc ctcgacgtacggatcacccg ctgctgcaac 540 aactacgggc cgtaccagca ccccgagaag ctcatccccctcttcgtgac gaacctcctc 600 gacggcggga cgctcccgct gtacggcgac ggcgcgaacgtccgcgagtg ggtgcacacc 660 gacgaccact gccggggcat cgcgctcgtc ctcgcgggcggccgggccgg cgagatctac 720 cacatcggcg gcggcctgga gctgaccaac cgcgaactcaccggcatcct cctggactcg 780 ctcggcgccg actggtcctc ggtccggaag gtcgccgaccgcaagggcca cgacctgcgc 840 tactccctcg acggcggcga gatcgagcgc gagctcggctaccgcccgca ggtctccttc 900 gcggacggcc tcgcgcggac cgtccgctgg taccgggagaaccgcggctg gtgggagccg 960 ctcaaggcga ccgccccgca gctgcccgcc accgccgtggaggtgtccgc gtga 1014 14 337 PRT Streptomyces venezuelae 14 Met Arg LeuLeu Val Thr Gly Gly Ala Gly Phe Ile Gly Ser His Phe 
1 5 10 15 Val ArgGln Leu Leu Ala Gly Ala Tyr Pro Asp Val Pro Ala Asp Glu 20 25 30 Val IleVal Leu Asp Ser Leu Thr Tyr Ala Gly Asn Arg Ala Asn Leu 35 40 45 Ala ProVal Asp Ala Asp Pro Arg Leu Arg Phe Val His Gly Asp Ile 50 55 60 Arg AspAla Gly Leu Leu Ala Arg Glu Leu Arg Gly Val Asp Ala Ile 65 70 75 80 ValHis Phe Ala Ala Glu Ser His Val Asp Arg Ser Ile Ala Gly Ala 85 90 95 SerVal Phe Thr Glu Thr Asn Val Gln Gly Thr Gln Thr Leu Leu Gln 100 105 110Cys Ala Val Asp Ala Gly Val Gly Arg Val Val His Val Ser Thr Asp 115 120125 Glu Val Tyr Gly Ser Ile Asp Ser Gly Ser Trp Thr Glu Ser Ser Pro 130135 140 Leu Glu Pro Asn Ser Pro Tyr Ala Ala Ser Lys Ala Gly Ser Asp Leu145 150 155 160 Val Ala Arg Ala Tyr His Arg Thr Tyr Gly Leu Asp Val ArgIle Thr 165 170 175 Arg Cys Cys Asn Asn Tyr Gly Pro Tyr Gln His Pro GluLys Leu Ile 180 185 190 Pro Leu Phe Val Thr Asn Leu Leu Asp Gly Gly ThrLeu Pro Leu Tyr 195 200 205 Gly Asp Gly Ala Asn Val Arg Glu Trp Val HisThr Asp Asp His Cys 210 215 220 Arg Gly Ile Ala Leu Val Leu Ala Gly GlyArg Ala Gly Glu Ile Tyr 225 230 235 240 His Ile Gly Gly Gly Leu Glu LeuThr Asn Arg Glu Leu Thr Gly Ile 245 250 255 Leu Leu Asp Ser Leu Gly AlaAsp Trp Ser Ser Val Arg Lys Val Ala 260 265 270 Asp Arg Lys Gly His AspLeu Arg Tyr Ser Leu Asp Gly Gly Glu Ile 275 280 285 Glu Arg Glu Leu GlyTyr Arg Pro Gln Val Ser Phe Ala Asp Gly Leu 290 295 300 Ala Arg Thr ValArg Trp Tyr Arg Glu Asn Arg Gly Trp Trp Glu Pro 305 310 315 320 Leu LysAla Thr Ala Pro Gln Leu Pro Ala Thr Ala Val Glu Val Ser 325 330 335 Ala15 1140 DNA Streptomyces venezuelae 15 gtgagcagcc gcgccgagac cccccgcgtccccttcctcg acctcaaggc cgcctacgag 60 gagctccgcg cggagaccga cgccgcgatcgcccgcgtcc tcgactcggg gcgctacctc 120 ctcggacccg aactcgaagg attcgaggcggagttcgccg cgtactgcga gacggaccac 180 gccgtcggcg tgaacagcgg gatggacgccctccagctcg ccctccgcgg cctcggcatc 240 ggacccgggg acgaggtgat cgtcccctcgcacacgtaca tcgccagctg gctcgcggtg 300 tccgccaccg gcgcgacccc cgtgcccgtcgagccgcacg aggaccaccc caccctggac 360 ccgctgctcg tcgagaaggc 
gatcaccccccgcacccggg cgctcctccc cgtccacctc 420 tacgggcacc ccgccgacat ggacgccctccgcgagctcg cggaccggca cggcctgcac 480 atcgtcgagg acgccgcgca ggcccacggcgcccgctacc ggggccggcg gatcggcgcc 540 gggtcgtcgg tggccgcgtt cagcttctacccgggcaaga acctcggctg cttcggcgac 600 ggcggcgccg tcgtcaccgg cgaccccgagctcgccgaac ggctccggat gctccgcaac 660 tacggctcgc ggcagaagta cagccacgagacgaagggca ccaactcccg cctggacgag 720 atgcaggccg ccgtgctgcg gatccggctcgcccacctgg acagctggaa cggccgcagg 780 tcggcgctgg ccgcggagta cctctccgggctcgccggac tgcccggcat cggcctgccg 840 gtgaccgcgc ccgacaccga cccggtctggcacctcttca ccgtgcgcac cgagcgccgc 900 gacgagctgc gcagccacct cgacgcccgcggcatcgaca ccctcacgca ctacccggta 960 cccgtgcacc tctcgcccgc ctacgcgggcgaggcaccgc cggaaggctc gctcccgcgg 1020 gccgagagct tcgcgcggca ggtcctcagcctgccgatcg gcccgcacct ggagcgcccg 1080 caggcgctgc gggtgatcga cgccgtgcgcgaatgggccg agcgggtcga ccaggcctag 1140 16 379 PRT Streptomyces venezuelae16 Met Ser Ser Arg Ala Glu Thr Pro Arg Val Pro Phe Leu Asp Leu Lys 1 510 15 Ala Ala Tyr Glu Glu Leu Arg Ala Glu Thr Asp Ala Ala Ile Ala Arg 2025 30 Val Leu Asp Ser Gly Arg Tyr Leu Leu Gly Pro Glu Leu Glu Gly Phe 3540 45 Glu Ala Glu Phe Ala Ala Tyr Cys Glu Thr Asp His Ala Val Gly Val 5055 60 Asn Ser Gly Met Asp Ala Leu Gln Leu Ala Leu Arg Gly Leu Gly Ile 6570 75 80 Gly Pro Gly Asp Glu Val Ile Val Pro Ser His Thr Tyr Ile Ala Ser85 90 95 Trp Leu Ala Val Ser Ala Thr Gly Ala Thr Pro Val Pro Val Glu Pro100 105 110 His Glu Asp His Pro Thr Leu Asp Pro Leu Leu Val Glu Lys AlaIle 115 120 125 Thr Pro Arg Thr Arg Ala Leu Leu Pro Val His Leu Tyr GlyHis Pro 130 135 140 Ala Asp Met Asp Ala Leu Arg Glu Leu Ala Asp Arg HisGly Leu His 145 150 155 160 Ile Val Glu Asp Ala Ala Gln Ala His Gly AlaArg Tyr Arg Gly Arg 165 170 175 Arg Ile Gly Ala Gly Ser Ser Val Ala AlaPhe Ser Phe Tyr Pro Gly 180 185 190 Lys Asn Leu Gly Cys Phe Gly Asp GlyGly Ala Val Val Thr Gly Asp 195 200 205 Pro Glu Leu Ala Glu Arg Leu ArgMet Leu Arg Asn Tyr Gly Ser Arg 210 215 220 Gln Lys Tyr Ser His Glu ThrLys Gly Thr Asn 
Ser Arg Leu Asp Glu 225 230 235 240 Met Gln Ala Ala ValLeu Arg Ile Arg Leu Ala His Leu Asp Ser Trp 245 250 255 Asn Gly Arg ArgSer Ala Leu Ala Ala Glu Tyr Leu Ser Gly Leu Ala 260 265 270 Gly Leu ProGly Ile Gly Leu Pro Val Thr Ala Pro Asp Thr Asp Pro 275 280 285 Val TrpHis Leu Phe Thr Val Arg Thr Glu Arg Arg Asp Glu Leu Arg 290 295 300 SerHis Leu Asp Ala Arg Gly Ile Asp Thr Leu Thr His Tyr Pro Val 305 310 315320 Pro Val His Leu Ser Pro Ala Tyr Ala Gly Glu Ala Pro Pro Glu Gly 325330 335 Ser Leu Pro Arg Ala Glu Ser Phe Ala Arg Gln Val Leu Ser Leu Pro340 345 350 Ile Gly Pro His Leu Glu Arg Pro Gln Ala Leu Arg Val Ile AspAla 355 360 365 Val Arg Glu Trp Ala Glu Arg Val Asp Gln Ala 370 375 17714 DNA Streptomyces venezuelae 17 gtgtacgaag tcgaccacgc cgacgtctacgacctcttct acctgggtcg cggcaaggac 60 tacgccgccg aggcctccga catcgccgacctggtgcgct cccgtacccc cgaggcctcc 120 tcgctcctgg acgtggcctg cggtacgggcacgcatctgg agcacttcac caaggagttc 180 ggcgacaccg ccggcctgga gctgtccgaggacatgctca cccacgcccg caagcggctg 240 cccgacgcca cgctccacca gggcgacatgcgggacttcc ggctcggccg gaagttctcc 300 gccgtggtca gcatgttcag ctccgtcggctacctgaaga cgaccgagga actcggcgcg 360 gccgtcgcct cgttcgcgga gcacctggagcccggtggcg tcgtcgtcgt cgagccgtgg 420 tggttcccgg agaccttcgc cgacggctgggtcagcgccg acgtcgtccg ccgtgacggg 480 cgcaccgtgg cccgtgtctc gcactcggtgcgggagggga acgcgacgcg catggaggtc 540 cacttcaccg tggccgaccc gggcaagggcgtgcggcact tctccgacgt ccatctcatc 600 accctgttcc accaggccga gtacgaggccgcgttcacgg ccgccgggct gcgcgtcgag 660 tacctggagg gcggcccgtc gggccgtggcctcttcgtcg gcgtccccgc ctga 714 18 237 PRT Streptomyces venezuelae 18 MetTyr Glu Val Asp His Ala Asp Val Tyr Asp Leu Phe Tyr Leu Gly 1 5 10 15Arg Gly Lys Asp Tyr Ala Ala Glu Ala Ser Asp Ile Ala Asp Leu Val 20 25 30Arg Ser Arg Thr Pro Glu Ala Ser Ser Leu Leu Asp Val Ala Cys Gly 35 40 45Thr Gly Thr His Leu Glu His Phe Thr Lys Glu Phe Gly Asp Thr Ala 50 55 60Gly Leu Glu Leu Ser Glu Asp Met Leu Thr His Ala Arg Lys Arg Leu 65 70 7580 Pro Asp Ala Thr Leu His Gln Gly Asp Met Arg Asp 
Phe Arg Leu Gly 85 9095 Arg Lys Phe Ser Ala Val Val Ser Met Phe Ser Ser Val Gly Tyr Leu 100105 110 Lys Thr Thr Glu Glu Leu Gly Ala Ala Val Ala Ser Phe Ala Glu His115 120 125 Leu Glu Pro Gly Gly Val Val Val Val Glu Pro Trp Trp Phe ProGlu 130 135 140 Thr Phe Ala Asp Gly Trp Val Ser Ala Asp Val Val Arg ArgAsp Gly 145 150 155 160 Arg Thr Val Ala Arg Val Ser His Ser Val Arg GluGly Asn Ala Thr 165 170 175 Arg Met Glu Val His Phe Thr Val Ala Asp ProGly Lys Gly Val Arg 180 185 190 His Phe Ser Asp Val His Leu Ile Thr LeuPhe His Gln Ala Glu Tyr 195 200 205 Glu Ala Ala Phe Thr Ala Ala Gly LeuArg Val Glu Tyr Leu Glu Gly 210 215 220 Gly Pro Ser Gly Arg Gly Leu PheVal Gly Val Pro Ala 225 230 235 19 1281 DNA Streptomyces venezuelae 19atgcgcgtcc tgctgacctc gttcgcacat cacacgcact actacggcct ggtgcccctg 60gcctgggcgc tgctcgccgc cgggcacgag gtgcgggtcg ccagccagcc cgcgctcacg 120gacaccatca ccgggtccgg gctcgccgcg gtgccggtcg gcaccgacca cctcatccac 180gagtaccggg tgcggatggc gggcgagccg cgcccgaacc atccggcgat cgccttcgac 240gaggcccgtc ccgagccgct ggactgggac cacgccctcg gcatcgaggc gatcctcgcc 300ccgtacttcc atctgctcgc caacaacgac tcgatggtcg acgacctcgt cgacttcgcc 360cggtcctggc agccggacct ggtgctgtgg gagccgacga cctacgcggg cgccgtcgcc 420gcccaggtca ccggtgccgc gcacgcccgg gtcctgtggg ggcccgacgt gatgggcagc 480gcccgccgca agttcgtcgc gctgcgggac cggcagccgc ccgagcaccg cgaggacccc 540accgcggagt ggctgacgtg gacgctcgac cggtacggcg cctccttcga agaggagctg 600ctcaccggcc agttcacgat cgacccgacc ccgccgagcc tgcgcctcga cacgggcctg 660ccgaccgtcg ggatgcgtta tgttccgtac aacggcacgt cggtcgtgcc ggactggctg 720agtgagccgc ccgcgcggcc ccgggtctgc ctgaccctcg gcgtctccgc gcgtgaggtc 780ctcggcggcg acggcgtctc gcagggcgac atcctggagg cgctcgccga cctcgacatc 840gagctcgtcg ccacgctcga cgcgagtcag cgcgccgaga tccgcaacta cccgaagcac 900acccggttca cggacttcgt gccgatgcac gcgctcctgc cgagctgctc ggcgatcatc 960caccacggcg gggcgggcac ctacgcgacc gccgtgatca acgcggtgcc gcaggtcatg 1020ctcgccgagc tgtgggacgc gccggtcaag gcgcgggccg tcgccgagca gggggcgggg 1080ttcttcctgc cgccggccga 
gctcacgccg caggccgtgc gggacgccgt cgtccgcatc 1140ctcgacgacc cctcggtcgc caccgccgcg caccggctgc gcgaggagac cttcggcgac 1200cccaccccgg ccgggatcgt ccccgagctg gagcggctcg ccgcgcagca ccgccgcccg 1260ccggccgacg cccggcactg a 1281 20 426 PRT Streptomyces venezuelae 20 MetArg Val Leu Leu Thr Ser Phe Ala His His Thr His Tyr Tyr Gly 1 5 10 15Leu Val Pro Leu Ala Trp Ala Leu Leu Ala Ala Gly His Glu Val Arg 20 25 30Val Ala Ser Gln Pro Ala Leu Thr Asp Thr Ile Thr Gly Ser Gly Leu 35 40 45Ala Ala Val Pro Val Gly Thr Asp His Leu Ile His Glu Tyr Arg Val 50 55 60Arg Met Ala Gly Glu Pro Arg Pro Asn His Pro Ala Ile Ala Phe Asp 65 70 7580 Glu Ala Arg Pro Glu Pro Leu Asp Trp Asp His Ala Leu Gly Ile Glu 85 9095 Ala Ile Leu Ala Pro Tyr Phe His Leu Leu Ala Asn Asn Asp Ser Met 100105 110 Val Asp Asp Leu Val Asp Phe Ala Arg Ser Trp Gln Pro Asp Leu Val115 120 125 Leu Trp Glu Pro Thr Thr Tyr Ala Gly Ala Val Ala Ala Gln ValThr 130 135 140 Gly Ala Ala His Ala Arg Val Leu Trp Gly Pro Asp Val MetGly Ser 145 150 155 160 Ala Arg Arg Lys Phe Val Ala Leu Arg Asp Arg GlnPro Pro Glu His 165 170 175 Arg Glu Asp Pro Thr Ala Glu Trp Leu Thr TrpThr Leu Asp Arg Tyr 180 185 190 Gly Ala Ser Phe Glu Glu Glu Leu Leu ThrGly Gln Phe Thr Ile Asp 195 200 205 Pro Thr Pro Pro Ser Leu Arg Leu AspThr Gly Leu Pro Thr Val Gly 210 215 220 Met Arg Tyr Val Pro Tyr Asn GlyThr Ser Val Val Pro Asp Trp Leu 225 230 235 240 Ser Glu Pro Pro Ala ArgPro Arg Val Cys Leu Thr Leu Gly Val Ser 245 250 255 Ala Arg Glu Val LeuGly Gly Asp Gly Val Ser Gln Gly Asp Ile Leu 260 265 270 Glu Ala Leu AlaAsp Leu Asp Ile Glu Leu Val Ala Thr Leu Asp Ala 275 280 285 Ser Gln ArgAla Glu Ile Arg Asn Tyr Pro Lys His Thr Arg Phe Thr 290 295 300 Asp PheVal Pro Met His Ala Leu Leu Pro Ser Cys Ser Ala Ile Ile 305 310 315 320His His Gly Gly Ala Gly Thr Tyr Ala Thr Ala Val Ile Asn Ala Val 325 330335 Pro Gln Val Met Leu Ala Glu Leu Trp Asp Ala Pro Val Lys Ala Arg 340345 350 Ala Val Ala Glu Gln Gly Ala Gly Phe Phe Leu Pro Pro Ala Glu Leu355 360 365 Thr Pro Gln Ala Val Arg 
Asp Ala Val Val Arg Ile Leu Asp AspPro 370 375 380 Ser Val Ala Thr Ala Ala His Arg Leu Arg Glu Glu Thr PheGly Asp 385 390 395 400 Pro Thr Pro Ala Gly Ile Val Pro Glu Leu Glu ArgLeu Ala Ala Gln 405 410 415 His Arg Arg Pro Pro Ala Asp Ala Arg His 420425 21 1209 DNA Streptomyces venezuelae 21 gtgaccgacg acctgacgggggccctcacg cagcccccgc tgggccgcac cgtccgcgcg 60 gtggccgacc gtgaactcggcacccacctc ctggagaccc gcggcatcca ctggatccac 120 gccgcgaacg gcgacccgtacgccaccgtg ctgcgcggcc aggcggacga cccgtatccc 180 gcgtacgagc gggtgcgtgcccgcggcgcg ctctccttca gcccgacggg cagctgggtc 240 accgccgatc acgccctggcggcgagcatc ctctgctcga cggacttcgg ggtctccggc 300 gccgacggcg tcccggtgccgcagcaggtc ctctcgtacg gggagggctg tccgctggag 360 cgcgagcagg tgctgccggcggccggtgac gtgccggagg gcgggcagcg tgccgtggtc 420 gaggggatcc accgggagacgctggagggt ctcgcgccgg acccgtcggc gtcgtacgcc 480 ttcgagctgc tgggcggtttcgtccgcccg gcggtgacgg ccgctgccgc cgccgtgctg 540 ggtgttcccg cggaccggcgcgcggacttc gcggatctgc tggagcggct ccggccgctg 600 tccgacagcc tgctggccccgcagtccctg cggacggtac gggcggcgga cggcgcgctg 660 gccgagctca cggcgctgctcgccgattcg gacgactccc ccggggccct gctgtcggcg 720 ctcggggtca ccgcagccgtccagctcacc gggaacgcgg tgctcgcgct cctcgcgcat 780 cccgagcagt ggcgggagctgtgcgaccgg cccgggctcg cggcggccgc ggtggaggag 840 accctccgct acgacccgccggtgcagctc gacgcccggg tggtccgcgg ggagacggag 900 ctggcgggcc ggcggctgccggccggggcg catgtcgtcg tcctgaccgc cgcgaccggc 960 cgggacccgg aggtcttcacggacccggag cgcttcgacc tcgcgcgccc cgacgccgcc 1020 gcgcacctcg cgctgcaccccgccggtccg tacggcccgg tggcgtccct ggtccggctt 1080 caggcggagg tcgcgctgcggaccctggcc gggcgtttcc ccgggctgcg gcaggcgggg 1140 gacgtgctcc gcccccgccgcgcgcctgtc ggccgcgggc cgctgagcgt cccggtcagc 1200 agctcctga 1209 22 402PRT Streptomyces venezuelae 22 Met Thr Asp Asp Leu Thr Gly Ala Leu ThrGln Pro Pro Leu Gly Arg 1 5 10 15 Thr Val Arg Ala Val Ala Asp Arg GluLeu Gly Thr His Leu Leu Glu 20 25 30 Thr Arg Gly Ile His Trp Ile His AlaAla Asn Gly Asp Pro Tyr Ala 35 40 45 Thr Val Leu Arg Gly Gln Ala Asp AspPro Tyr Pro Ala Tyr 
Glu Arg 50 55 60 Val Arg Ala Arg Gly Ala Leu Ser PheSer Pro Thr Gly Ser Trp Val 65 70 75 80 Thr Ala Asp His Ala Leu Ala AlaSer Ile Leu Cys Ser Thr Asp Phe 85 90 95 Gly Val Ser Gly Ala Asp Gly ValPro Val Pro Gln Gln Val Leu Ser 100 105 110 Tyr Gly Glu Gly Cys Pro LeuGlu Arg Glu Gln Val Leu Pro Ala Ala 115 120 125 Gly Asp Val Pro Glu GlyGly Gln Arg Ala Val Val Glu Gly Ile His 130 135 140 Arg Glu Thr Leu GluGly Leu Ala Pro Asp Pro Ser Ala Ser Tyr Ala 145 150 155 160 Phe Glu LeuLeu Gly Gly Phe Val Arg Pro Ala Val Thr Ala Ala Ala 165 170 175 Ala AlaVal Leu Gly Val Pro Ala Asp Arg Arg Ala Asp Phe Ala Asp 180 185 190 LeuLeu Glu Arg Leu Arg Pro Leu Ser Asp Ser Leu Leu Ala Pro Gln 195 200 205Ser Leu Arg Thr Val Arg Ala Ala Asp Gly Ala Leu Ala Glu Leu Thr 210 215220 Ala Leu Leu Ala Asp Ser Asp Asp Ser Pro Gly Ala Leu Leu Ser Ala 225230 235 240 Leu Gly Val Thr Ala Ala Val Gln Leu Thr Gly Asn Ala Val LeuAla 245 250 255 Leu Leu Ala His Pro Glu Gln Trp Arg Glu Leu Cys Asp ArgPro Gly 260 265 270 Leu Ala Ala Ala Ala Val Glu Glu Thr Leu Arg Tyr AspPro Pro Val 275 280 285 Gln Leu Asp Ala Arg Val Val Arg Gly Glu Thr GluLeu Ala Gly Arg 290 295 300 Arg Leu Pro Ala Gly Ala His Val Val Val LeuThr Ala Ala Thr Gly 305 310 315 320 Arg Asp Pro Glu Val Phe Thr Asp ProGlu Arg Phe Asp Leu Ala Arg 325 330 335 Pro Asp Ala Ala Ala His Leu AlaLeu His Pro Ala Gly Pro Tyr Gly 340 345 350 Pro Val Ala Ser Leu Val ArgLeu Gln Ala Glu Val Ala Leu Arg Thr 355 360 365 Leu Ala Gly Arg Phe ProGly Leu Arg Gln Ala Gly Asp Val Leu Arg 370 375 380 Pro Arg Arg Ala ProVal Gly Arg Gly Pro Leu Ser Val Pro Val Ser 385 390 395 400 Ser Ser 232430 DNA Streptomyces venezuelae 23 gtgacaggta agacccgaat accgcgtgtccgccgcggcc gcaccacgcc cagggccttc 60 accctggccg tcgtcggcac cctgctggcgggcaccaccg tggcggccgc cgctcccggc 120 gccgccgaca cggccaatgt tcagtacacgagccgggcgg cggagctcgt cgcccagatg 180 acgctcgacg agaagatcag cttcgtccactgggcgctgg accccgaccg gcagaacgtc 240 ggctaccttc ccggcgtgcc gcgtctgggcatcccggagc tgcgtgccgc cgacggcccg 300 
aacggcatcc gcctggtggg gcagaccgccaccgcgctgc ccgcgccggt cgccctggcc 360 agcaccttcg acgacaccat ggccgacagctacggcaagg tcatgggccg cgacggtcgc 420 gcgctcaacc aggacatggt cctgggcccgatgatgaaca acatccgggt gccgcacggc 480 ggccggaact acgagacctt cagcgaggaccccctggtct cctcgcgcac cgcggtcgcc 540 cagatcaagg gcatccaggg tgcgggtctgatgaccacgg ccaagcactt cgcggccaac 600 aaccaggaga acaaccgctt ctccgtgaacgccaatgtcg acgagcagac gctccgcgag 660 atcgagttcc cggcgttcga ggcgtcctccaaggccggcg cggcctcctt catgtgtgcc 720 tacaacggcc tcaacgggaa gccgtcctgcggcaacgacg agctcctcaa caacgtgctg 780 cgcacgcagt ggggcttcca gggctgggtgatgtccgact ggctcgccac cccgggcacc 840 gacgccatca ccaagggcct cgaccaggagatgggcgtcg agctccccgg cgacgtcccg 900 aagggcgagc cctcgccgcc ggccaagttcttcggcgagg cgctgaagac ggccgtcctg 960 aacggcacgg tccccgaggc ggccgtgacgcggtcggcgg agcggatcgt cggccagatg 1020 gagaagttcg gtctgctcct cgccactccggcgccgcggc ccgagcgcga caaggcgggt 1080 gcccaggcgg tgtcccgcaa ggtcgccgagaacggcgcgg tgctcctgcg caacgagggc 1140 caggccctgc cgctcgccgg tgacgccggcaagagcatcg cggtcatcgg cccgacggcc 1200 gtcgacccca aggtcaccgg cctgggcagcgcccacgtcg tcccggactc ggcggcggcg 1260 ccactcgaca ccatcaaggc ccgcgcgggtgcgggtgcga cggtgacgta cgagacgggt 1320 gaggagacct tcgggacgca gatcccggcggggaacctca gcccggcgtt caaccagggc 1380 caccagctcg agccgggcaa ggcgggggcgctgtacgacg gcacgctgac cgtgcccgcc 1440 gacggcgagt accgcatcgc ggtccgtgccaccggtggtt acgccacggt gcagctcggc 1500 agccacacca tcgaggccgg tcaggtctacggcaaggtga gcagcccgct cctcaagctg 1560 accaagggca cgcacaagct cacgatctcgggcttcgcga tgagtgccac cccgctctcc 1620 ctggagctgg gctgggtgac gccggcggcggccgacgcga cgatcgcgaa ggccgtggag 1680 tcggcgcgga aggcccgtac ggcggtcgtcttcgcctacg acgacggcac cgagggcgtc 1740 gaccgtccga acctgtcgct gccgggtacgcaggacaagc tgatctcggc tgtcgcggac 1800 gccaacccga acacgatcgt ggtcctcaacaccggttcgt cggtgctgat gccgtggctg 1860 tccaagaccc gcgcggtcct ggacatgtggtacccgggcc aggcgggcgc cgaggccacc 1920 gccgcgctgc tctacggtga cgtcaacccgagcggcaagc tcacgcagag cttcccggcc 1980 gccgagaacc agcacgcggt cgccggcgacccgacaagct 
acccgggcgt cgacaaccag 2040 cagacgtacc gcgagggcat ccacgtcgggtaccgctggt tcgacaagga gaacgtcaag 2100 ccgctgttcc cgttcgggca cggcctgtcgtacacctcgt tcacgcagag cgccccgacc 2160 gtcgtgcgta cgtccacggg tggtctgaaggtcacggtca cggtccgcaa cagcgggaag 2220 cgcgccggcc aggaggtcgt ccaggcgtacctcggtgcca gcccgaacgt gacggctccg 2280 caggcgaaga agaagctcgt gggctacacgaaggtctcgc tcgccgcggg cgaggcgaag 2340 acggtgacgg tgaacgtcga ccgccgtcagctgcagaccg gttcgtcctc cgccgacctg 2400 cggggcagcg ccacggtcaa cgtctggtga2430 24 809 PRT Streptomyces venezuelae 24 Met Thr Gly Lys Thr Arg IlePro Arg Val Arg Arg Gly Arg Thr Thr 1 5 10 15 Pro Arg Ala Phe Thr LeuAla Val Val Gly Thr Leu Leu Ala Gly Thr 20 25 30 Thr Val Ala Ala Ala AlaPro Gly Ala Ala Asp Thr Ala Asn Val Gln 35 40 45 Tyr Thr Ser Arg Ala AlaGlu Leu Val Ala Gln Met Thr Leu Asp Glu 50 55 60 Lys Ile Ser Phe Val HisTrp Ala Leu Asp Pro Asp Arg Gln Asn Val 65 70 75 80 Gly Tyr Leu Pro GlyVal Pro Arg Leu Gly Ile Pro Glu Leu Arg Ala 85 90 95 Ala Asp Gly Pro AsnGly Ile Arg Leu Val Gly Gln Thr Ala Thr Ala 100 105 110 Leu Pro Ala ProVal Ala Leu Ala Ser Thr Phe Asp Asp Thr Met Ala 115 120 125 Asp Ser TyrGly Lys Val Met Gly Arg Asp Gly Arg Ala Leu Asn Gln 130 135 140 Asp MetVal Leu Gly Pro Met Met Asn Asn Ile Arg Val Pro His Gly 145 150 155 160Gly Arg Asn Tyr Glu Thr Phe Ser Glu Asp Pro Leu Val Ser Ser Arg 165 170175 Thr Ala Val Ala Gln Ile Lys Gly Ile Gln Gly Ala Gly Leu Met Thr 180185 190 Thr Ala Lys His Phe Ala Ala Asn Asn Gln Glu Asn Asn Arg Phe Ser195 200 205 Val Asn Ala Asn Val Asp Glu Gln Thr Leu Arg Glu Ile Glu PhePro 210 215 220 Ala Phe Glu Ala Ser Ser Lys Ala Gly Ala Ala Ser Phe MetCys Ala 225 230 235 240 Tyr Asn Gly Leu Asn Gly Lys Pro Ser Cys Gly AsnAsp Glu Leu Leu 245 250 255 Asn Asn Val Leu Arg Thr Gln Trp Gly Phe GlnGly Trp Val Met Ser 260 265 270 Asp Trp Leu Ala Thr Pro Gly Thr Asp AlaIle Thr Lys Gly Leu Asp 275 280 285 Gln Glu Met Gly Val Glu Leu Pro GlyAsp Val Pro Lys Gly Glu Pro 290 295 300 Ser Pro Pro Ala Lys Phe Phe GlyGlu Ala Leu Lys Thr Ala Val 
Leu 305 310 315 320 Asn Gly Thr Val Pro GluAla Ala Val Thr Arg Ser Ala Glu Arg Ile 325 330 335 Val Gly Gln Met GluLys Phe Gly Leu Leu Leu Ala Thr Pro Ala Pro 340 345 350 Arg Pro Glu ArgAsp Lys Ala Gly Ala Gln Ala Val Ser Arg Lys Val 355 360 365 Ala Glu AsnGly Ala Val Leu Leu Arg Asn Glu Gly Gln Ala Leu Pro 370 375 380 Leu AlaGly Asp Ala Gly Lys Ser Ile Ala Val Ile Gly Pro Thr Ala 385 390 395 400Val Asp Pro Lys Val Thr Gly Leu Gly Ser Ala His Val Val Pro Asp 405 410415 Ser Ala Ala Ala Pro Leu Asp Thr Ile Lys Ala Arg Ala Gly Ala Gly 420425 430 Ala Thr Val Thr Tyr Glu Thr Gly Glu Glu Thr Phe Gly Thr Gln Ile435 440 445 Pro Ala Gly Asn Leu Ser Pro Ala Phe Asn Gln Gly His Gln LeuGlu 450 455 460 Pro Gly Lys Ala Gly Ala Leu Tyr Asp Gly Thr Leu Thr ValPro Ala 465 470 475 480 Asp Gly Glu Tyr Arg Ile Ala Val Arg Ala Thr GlyGly Tyr Ala Thr 485 490 495 Val Gln Leu Gly Ser His Thr Ile Glu Ala GlyGln Val Tyr Gly Lys 500 505 510 Val Ser Ser Pro Leu Leu Lys Leu Thr LysGly Thr His Lys Leu Thr 515 520 525 Ile Ser Gly Phe Ala Met Ser Ala ThrPro Leu Ser Leu Glu Leu Gly 530 535 540 Trp Val Thr Pro Ala Ala Ala AspAla Thr Ile Ala Lys Ala Val Glu 545 550 555 560 Ser Ala Arg Lys Ala ArgThr Ala Val Val Phe Ala Tyr Asp Asp Gly 565 570 575 Thr Glu Gly Val AspArg Pro Asn Leu Ser Leu Pro Gly Thr Gln Asp 580 585 590 Lys Leu Ile SerAla Val Ala Asp Ala Asn Pro Asn Thr Ile Val Val 595 600 605 Leu Asn ThrGly Ser Ser Val Leu Met Pro Trp Leu Ser Lys Thr Arg 610 615 620 Ala ValLeu Asp Met Trp Tyr Pro Gly Gln Ala Gly Ala Glu Ala Thr 625 630 635 640Ala Ala Leu Leu Tyr Gly Asp Val Asn Pro Ser Gly Lys Leu Thr Gln 645 650655 Ser Phe Pro Ala Ala Glu Asn Gln His Ala Val Ala Gly Asp Pro Thr 660665 670 Ser Tyr Pro Gly Val Asp Asn Gln Gln Thr Tyr Arg Glu Gly Ile His675 680 685 Val Gly Tyr Arg Trp Phe Asp Lys Glu Asn Val Lys Pro Leu PhePro 690 695 700 Phe Gly His Gly Leu Ser Tyr Thr Ser Phe Thr Gln Ser AlaPro Thr 705 710 715 720 Val Val Arg Thr Ser Thr Gly Gly Leu Lys Val ThrVal Thr Val Arg 725 730 735 Asn Ser Gly 
Lys Arg Ala Gly Gln Glu Val Val Gln Ala Tyr Leu Gly 740 745 750 Ala Ser Pro Asn Val Thr Ala Pro Gln Ala Lys Lys Lys Leu Val Gly 755 760 765 Tyr Thr Lys Val Ser Leu Ala Ala Gly Glu Ala Lys Thr Val Thr Val 770 775 780 Asn Val Asp Arg Arg Gln Leu Gln Thr Gly Ser Ser Ser Ala Asp Leu 785 790 795 800 Arg Gly Ser Ala Thr Val Asn Val Trp 805
25 9 PRT Artificial Sequence A consensus sequence. 25 Leu Leu Asp Val Ala Cys Gly Thr Gly 1 5
26 1011 DNA Streptomyces venezuelae 26 atggcaatgc gcgactccat accgaggcga gcggaccgcg acacccttcg ccgcgaatta 60 ggccagaact tccttcagga cgacagagcc gtgcgcaatc tcgtcacgca tgtcgagggg 120 gacggtagga acgttctcga aatcggcccc ggaaagggcg cgataaccga ggagttggtg 180 cgctccttcg acaccgtgac ggtcgtggag atggacccgc actgggccgc gcatgtgcgg 240 cggaaattcg aaggggagag ggtcaccgta ttccagggtg atttcctcga cttccgcatt 300 ccgcgcgata tcgacaccgt cgtcggaaac gttcccttcg gcatcacgac ccagattctc 360 cggagtctcc tggaatcgac gaactggcag tcggcggccc tgatagtgca gtgggaggtc 420 gcccgcaaac gcgccggtcg cagcggcgga tcgctcctca cgacctcctg ggccccctgg 480 tacgagttcg cggtccacga ccgcgtccgc gcctcgtcgt tccgtccgat gccccgcgtc 540 gacggcggcg tcctgacgat caggcgacgc ccccagcccc tgctgcccga gagcgcgagc 600 cgcgccttcc agaacttcgc cgaagccgtc ttcaccggcc ccggacgggg cctcgcggag 660 atcctccggc gccacatccc caagcggacc taccgttccc tcgccgaccg ccacggaatt 720 ccggacggcg gactgccgaa ggacctcacg ctcacccaat ggatcgccct tttccaggcc 780 tcccagccga gttacgcgcc gggggcgccc ggcacgcgca tgccgggcca gggcggtggc 840 gccggcggca gggactatga ctcggagacg agcagggccg ccgtgcccgg gagccgcaga 900 tacggcccca cgcgcggcgg cgaaccctgc gcaccccgcg cacaggtccg gcagaccaag 960 ggccgccagg gcgcgcgagg ctcgtcgtac ggacgccgca cgggccgtta g 1011
27 336 PRT Streptomyces venezuelae 27 Met Ala Met Arg Asp Ser Ile Pro Arg Arg Ala Asp Arg Asp Thr Leu 1 5 10 15 Arg Arg Glu Leu Gly Gln Asn Phe Leu Gln Asp Asp Arg Ala Val Arg 20 25 30 Asn Leu Val Thr His Val Glu Gly Asp Gly Arg Asn Val Leu Glu Ile 35 40 45 Gly Pro Gly Lys Gly Ala Ile Thr Glu Glu Leu Val Arg Ser Phe Asp 50 55 60 Thr Val Thr Val Val Glu Met Asp Pro His Trp Ala
Ala His Val Arg 65 70 75 80 Arg Lys Phe Glu Gly Glu Arg Val Thr Val Phe Gln Gly Asp Phe Leu 85 90 95 Asp Phe Arg Ile Pro Arg Asp Ile Asp Thr Val Val Gly Asn Val Pro 100 105 110 Phe Gly Ile Thr Thr Gln Ile Leu Arg Ser Leu Leu Glu Ser Thr Asn 115 120 125 Trp Gln Ser Ala Ala Leu Ile Val Gln Trp Glu Val Ala Arg Lys Arg 130 135 140 Ala Gly Arg Ser Gly Gly Ser Leu Leu Thr Thr Ser Trp Ala Pro Trp 145 150 155 160 Tyr Glu Phe Ala Val His Asp Arg Val Arg Ala Ser Ser Phe Arg Pro 165 170 175 Met Pro Arg Val Asp Gly Gly Val Leu Thr Ile Arg Arg Arg Pro Gln 180 185 190 Pro Leu Leu Pro Glu Ser Ala Ser Arg Ala Phe Gln Asn Phe Ala Glu 195 200 205 Ala Val Phe Thr Gly Pro Gly Arg Gly Leu Ala Glu Ile Leu Arg Arg 210 215 220 His Ile Pro Lys Arg Thr Tyr Arg Ser Leu Ala Asp Arg His Gly Ile 225 230 235 240 Pro Asp Gly Gly Leu Pro Lys Asp Leu Thr Leu Thr Gln Trp Ile Ala 245 250 255 Leu Phe Gln Ala Ser Gln Pro Ser Tyr Ala Pro Gly Ala Pro Gly Thr 260 265 270 Arg Met Pro Gly Gln Gly Gly Gly Ala Gly Gly Arg Asp Tyr Asp Ser 275 280 285 Glu Thr Ser Arg Ala Ala Val Pro Gly Ser Arg Arg Tyr Gly Pro Thr 290 295 300 Arg Gly Gly Glu Pro Cys Ala Pro Arg Ala Gln Val Arg Gln Thr Lys 305 310 315 320 Gly Arg Gln Gly Ala Arg Gly Ser Ser Tyr Gly Arg Arg Thr Gly Arg 325 330 335
28 969 DNA Streptomyces venezuelae 28 atggcatttt ccccgcaggg cggccgacac gagctcggtc agaacttcct cgtcgaccgg 60 tcagtgatcg acgagatcga cggcctggtg gccaggacca agggtccgat actggagatc 120 ggtccgggtg acggcgccct gaccctgccg ctgagcaggc acggcaggcc gatcaccgcc 180 gtcgagctcg acggccggcg cgcgcagcgc ctcggtgccc gcacccccgg tcatgtgacc 240 gtggtgcacc acgacttcct gcagtacccg ctgccgcgca acccgcatgt ggtcgtcggc 300 aacgtcccct tccatctgac gacggcgatc atgcggcggc tgctcgacgc ccagcactgg 360 cacaccgccg tcctcctcgt ccagtgggag gtcgcccggc gccgggccgg cgtcggcggg 420 tcgacgctgc tgacggccgg ctgggcgccc tggtacgagt tcgacctgca ctcccgggtc 480 cccgcgcggg ccttccgtcc gatgccgggc gtggacggag gagtactggc catccggcgg 540 cggtccgcgc cgctcgtggg ccaggtgaag acgtaccagg acttcgtacg ccaggtgttc 600 accggcaagg ggaacgggct gaaggagatc
ctgcggcgga ccgggcggat ctcgcagcgg 660 gacctggcga cctggctgcg gaggaacgag atctcgccgc acgcgctgcc caaggacctg 720 aagcccgggc agtgggcgtc gctgtgggag ctgaccggcg gcacggccga cggatccttc 780 gacggtacgg cgggcggtgg cgcggccgga tcgcacgggg cggctcgggt cggggccggt 840 cacccgggcg gccgggtgtc cgcgagccgg cggggcgtgc cgcaggcgcg gcgcggccgg 900 gggcatgcgg tacggagctc cacggggacc gagccgaggt ggggcagggg gcgggcggag 960 agcgcgtga 969
29 322 PRT Streptomyces venezuelae 29 Met Ala Phe Ser Pro Gln Gly Gly Arg His Glu Leu Gly Gln Asn Phe 1 5 10 15 Leu Val Asp Arg Ser Val Ile Asp Glu Ile Asp Gly Leu Val Ala Arg 20 25 30 Thr Lys Gly Pro Ile Leu Glu Ile Gly Pro Gly Asp Gly Ala Leu Thr 35 40 45 Leu Pro Leu Ser Arg His Gly Arg Pro Ile Thr Ala Val Glu Leu Asp 50 55 60 Gly Arg Arg Ala Gln Arg Leu Gly Ala Arg Thr Pro Gly His Val Thr 65 70 75 80 Val Val His His Asp Phe Leu Gln Tyr Pro Leu Pro Arg Asn Pro His 85 90 95 Val Val Val Gly Asn Val Pro Phe His Leu Thr Thr Ala Ile Met Arg 100 105 110 Arg Leu Leu Asp Ala Gln His Trp His Thr Ala Val Leu Leu Val Gln 115 120 125 Trp Glu Val Ala Arg Arg Arg Ala Gly Val Gly Gly Ser Thr Leu Leu 130 135 140 Thr Ala Gly Trp Ala Pro Trp Tyr Glu Phe Asp Leu His Ser Arg Val 145 150 155 160 Pro Ala Arg Ala Phe Arg Pro Met Pro Gly Val Asp Gly Gly Val Leu 165 170 175 Ala Ile Arg Arg Arg Ser Ala Pro Leu Val Gly Gln Val Lys Thr Tyr 180 185 190 Gln Asp Phe Val Arg Gln Val Phe Thr Gly Lys Gly Asn Gly Leu Lys 195 200 205 Glu Ile Leu Arg Arg Thr Gly Arg Ile Ser Gln Arg Asp Leu Ala Thr 210 215 220 Trp Leu Arg Arg Asn Glu Ile Ser Pro His Ala Leu Pro Lys Asp Leu 225 230 235 240 Lys Pro Gly Gln Trp Ala Ser Leu Trp Glu Leu Thr Gly Gly Thr Ala 245 250 255 Asp Gly Ser Phe Asp Gly Thr Ala Gly Gly Gly Ala Ala Gly Ser His 260 265 270 Gly Ala Ala Arg Val Gly Ala Gly His Pro Gly Gly Arg Val Ser Ala 275 280 285 Ser Arg Arg Gly Val Pro Gln Ala Arg Arg Gly Arg Gly His Ala Val 290 295 300 Arg Ser Ser Thr Gly Thr Glu Pro Arg Trp Gly Arg Gly Arg Ala Glu 305 310 315 320 Ser Ala
30 13842 DNA Streptomyces venezuelae 30 atgtcttcag ccggaattac
caggaccggt gcgagaacac cggtgacagg gcgtggggcg 60gcagcgtggg acacggggga agtgcgggtc cgacgggggt tgccccctgc cggccccgat 120catgcggagc actccttctc tcgtgctcct accggtgatg tgcgcgccga attgattcgt 180ggagagatgt cgacagtgtc caagagtgag tccgaggaat tcgtgtccgt gtcgaacgac 240gccggttccg cgcacggcac agcggaaccc gtcgccgtcg tcggcatctc ctgccgggtg 300cccggcgccc gggacccgag agagttctgg gaactcctgg cggcaggcgg ccaggccgtc 360accgacgtcc ccgcggaccg ctggaacgcc ggcgacttct acgacccgga ccgctccgcc 420cccggccgct cgaacagccg gtggggcggg ttcatcgagg acgtcgaccg gttcgacgcc 480gccttcttcg gcatctcgcc ccgcgaggcc gcggagatgg acccgcagca gcggctcgcc 540ctggagctgg gctgggaggc cctggagcgc gccgggatcg acccgtcctc gctcaccggc 600acccgcaccg gcgtcttcgc cggcgccatc tgggacgact acgccaccct gaagcaccgc 660cagggcggcg ccgcgatcac cccgcacacc gtcaccggcc tccaccgcgg catcatcgcg 720aaccgactct cgtacacgct cgggctccgc ggccccagca tggtcgtcga ctccggccag 780tcctcgtcgc tcgtcgccgt ccacctcgcg tgcgagagcc tgcggcgcgg cgagtccgag 840ctcgccctcg ccggcggcgt ctcgctcaac ctggtgccgg acagcatcat cggggcgagc 900aagttcggcg gcctctcccc cgacggccgc gcctacacct tcgacgcgcg cgccaacggc 960tacgtacgcg gcgagggcgg cggtttcgtc gtcctgaagc gcctctcccg ggccgtcgcc 1020gacggcgacc cggtgctcgc cgtgatccgg ggcagcgccg tcaacaacgg cggcgccgcc 1080cagggcatga cgacccccga cgcgcaggcg caggaggccg tgctccgcga ggcccacgag 1140cgggccggga ccgcgccggc cgacgtgcgg tacgtcgagc tgcacggcac cggcaccccc 1200gtgggcgacc cgatcgaggc cgctgcgctc ggcgccgccc tcggcaccgg ccgcccggcc 1260ggacagccgc tcctggtcgg ctcggtcaag acgaacatcg gccacctgga gggcgcggcc 1320ggcatcgccg gcctcatcaa ggccgtcctg gcggtccgcg gtcgcgcgct gcccgccagc 1380ctgaactacg agaccccgaa cccggcgatc ccgttcgagg aactgaacct ccgggtgaac 1440acggagtacc tgccgtggga gccggagcac gacgggcagc ggatggtcgt cggcgtgtcc 1500tcgttcggca tgggcggcac gaacgcgcat gtcgtgctcg aagaggcccc cgggggttgt 1560cgaggtgctt cggtcgtgga gtcgacggtc ggcgggtcgg cggtcggcgg cggtgtggtg 1620ccgtgggtgg tgtcggcgaa gtccgctgcc gcgctggacg cgcagatcga gcggcttgcc 1680gcgttcgcct cgcgggatcg tacggatggt gtcgacgcgg gcgctgtcga tgcgggtgct 
1740gtcgatgcgg gtgctgtcgc tcgcgtactg gccggcgggc gtgctcagtt cgagcaccgg 1800gccgtcgtcg tcggcagcgg gccggacgat ctggcggcag cgctggccgc gcctgagggt 1860ctggtccggg gcgtggcttc cggtgtcggg cgagtggcgt tcgtgttccc cgggcagggc 1920acgcagtggg ccggcatggg tgccgaactg ctggactctt ccgcggtgtt cgcggcggcc 1980atggccgaat gcgaggccgc actctccccg tacgtcgact ggtcgctgga ggccgtcgta 2040cggcaggccc ccggtgcgcc cacgctggag cgggtcgatg tcgtgcagcc tgtgacgttc 2100gccgtcatgg tctcgctggc tcgcgtgtgg cagcaccacg gggtgacgcc ccaggcggtc 2160gtcggccact cgcagggcga gatcgccgcc gcgtacgtcg ccggtgccct gagcctggac 2220gacgccgctc gtgtcgtgac cctgcgcagc aagtccatcg ccgcccacct cgccggcaag 2280ggcggcatgc tgtccctcgc gctgagcgag gacgccgtcc tggagcgact ggccgggttc 2340gacgggctgt ccgtcgccgc tgtgaacggg cccaccgcca ccgtggtctc cggtgacccc 2400gtacagatcg aagagcttgc tcgggcgtgt gaggccgatg gggtccgtgc gcgggtcatt 2460cccgtcgact acgcgtccca cagccggcag gtcgagatca tcgagagcga gctcgccgag 2520gtcctcgccg ggctcagccc gcaggctccg cgcgtgccgt tcttctcgac actcgaaggc 2580gcctggatca ccgagcccgt gctcgacggc ggctactggt accgcaacct gcgccatcgt 2640gtgggcttcg ccccggccgt cgagaccctg gccaccgacg agggcttcac ccacttcgtc 2700gaggtcagcg cccaccccgt cctcaccatg gccctccccg ggaccgtcac cggtctggcg 2760accctgcgtc gcgacaacgg cggtcaggac cgcctagtcg cctccctcgc cgaagcatgg 2820gccaacggac tcgcggtcga ctggagcccg ctcctcccct ccgcgaccgg ccaccactcc 2880gacctcccca cctacgcgtt ccagaccgag cgccactggc tgggcgagat cgaggcgctc 2940gccccggcgg gcgagccggc ggtgcagccc gccgtcctcc gcacggaggc ggccgagccg 3000gcggagctcg accgggacga gcagctgcgc gtgatcctgg acaaggtccg ggcgcagacg 3060gcccaggtgc tggggtacgc gacaggcggg cagatcgagg tcgaccggac cttccgtgag 3120gccggttgca cctccctgac cggcgtggac ctgcgcaacc ggatcaacgc cgccttcggc 3180gtacggatgg cgccgtccat gatcttcgac ttccccaccc ccgaggctct cgcggagcag 3240ctgctcctcg tcgtgcacgg ggaggcggcg gcgaacccgg ccggtgcgga gccggctccg 3300gtggcggcgg ccggtgccgt cgacgagccg gtggcgatcg tcggcatggc ctgccgcctg 3360cccggtgggg tcgcctcgcc ggaggacctg tggcggctgg tggccggcgg cggggacgcg 3420atctcggagt tcccgcagga ccgcggctgg 
gacgtggagg ggctgtacca cccggatccg 3480gagcaccccg gcacgtcgta cgtccgccag ggcggtttca tcgagaacgt cgccggcttc 3540gacgcggcct tcttcgggat ctcgccgcgc gaggccctcg ccatggaccc gcagcagcgg 3600ctcctcctcg aaacctcctg ggaggccgtc gaggacgccg ggatcgaccc gacctccctg 3660cggggacggc aggtcggcgt cttcactggg gcgatgaccc acgagtacgg gccgagcctg 3720cgggacggcg gggaaggcct cgacggctac ctgctgaccg gcaacacggc cagcgtgatg 3780tcgggccgcg tctcgtacac actcggcctt gagggccccg ccctgacggt ggacacggcc 3840tgctcgtcgt cgctggtcgc cctgcacctc gccgtgcagg ccctgcgcaa gggcgaggtc 3900gacatggcgc tcgccggcgg cgtggccgtg atgcccacgc ccgggatgtt cgtcgagttc 3960agccggcagc gcgggctggc cggggacggc cggtcgaagg cgttcgccgc gtcggcggac 4020ggcaccagct ggtccgaggg cgtcggcgtc ctcctcgtcg agcgcctgtc ggacgcccgc 4080cgcaacggac accaggtcct cgcggtcgtc cgcggcagcg ccttgaacca ggacggcgcg 4140agcaacggcc tcacggctcc gaacgggccc tcgcagcagc gcgtcatccg gcgcgcgctg 4200gcggacgccc ggctgacgac ctccgacgtg gacgtcgtcg aggcacacgg cacgggcacg 4260cgactcggcg acccgatcga ggcgcaggcc ctgatcgcca cctacggcca gggccgtgac 4320gacgaacagc cgctgcgcct cgggtcgttg aagtccaaca tcgggcacac ccaggccgcg 4380gccggcgtct ccggtgtcat caagatggtc caggcgatgc gccacggact gctgccgaag 4440acgctgcacg tcgacgagcc ctcggaccag atcgactggt cggctggcgc cgtggaactc 4500ctcaccgagg ccgtcgactg gccggagaag caggacggcg ggctgcgccg ggccgccgtc 4560tcctccttcg ggatcagcgg caccaatgcg catgtggtgc tcgaagaggc cccggtggtt 4620gtcgagggtg cttcggtcgt cgagccgtcg gttggcgggt cggcggtcgg cggcggtgtg 4680acgccttggg tggtgtcggc gaagtccgct gccgcgctcg acgcgcagat cgagcggctt 4740gccgcattcg cctcgcggga tcgtacggat gacgccgacg ccggtgctgt cgacgcgggc 4800gctgtcgctc acgtactggc tgacgggcgt gctcagttcg agcaccgggc cgtcgcgctc 4860ggcgccgggg cggacgacct cgtacaggcg ctggccgatc cggacgggct gatacgcgga 4920acggcttccg gtgtcgggcg agtggcgttc gtgttccccg gtcagggcac gcagtgggct 4980ggcatgggtg ccgaactgct ggactcttcc gcggtgttcg cggcggccat ggccgagtgt 5040gaggccgcgc tgtccccgta cgtcgactgg tcgctggagg ccgtcgtacg gcaggccccc 5100ggtgcgccca cgctggagcg ggtcgatgtc gtgcagcctg tgacgttcgc cgtcatggtc 
5160tcgctggctc gcgtgtggca gcaccacggt gtgacgcccc aggcggtcgt cggccactcg 5220cagggcgaga tcgccgccgc gtacgtcgcc ggagccctgc ccctggacga cgccgcccgc 5280gtcgtcaccc tgcgcagcaa gtccatcgcc gcccacctcg ccggcaaggg cggcatgctg 5340tccctcgcgc tgaacgagga cgccgtcctg gagcgactga gtgacttcga cgggctgtcc 5400gtcgccgccg tcaacgggcc caccgccact gtcgtgtcgg gtgaccccgt acagatcgaa 5460gagcttgctc aggcgtgcaa ggcggacgga ttccgcgcgc ggatcattcc cgtcgactac 5520gcgtcccaca gccggcaggt cgagatcatc gagagcgagc tcgcccaggt cctcgccggt 5580ctcagcccgc aggccccgcg cgtgccgttc ttctcgacgc tcgaaggcac ctggatcacc 5640gagcccgtcc tcgacggcac ctactggtac cgcaacctcc gtcaccgcgt cggcttcgcc 5700cccgccatcg agaccctggc cgtcgacgag ggcttcacgc acttcgtcga ggtcagcgcc 5760caccccgtcc tcaccatgac cctccccgag accgtcaccg gcctcggcac cctccgtcgc 5820gaacagggag gccaagagcg tctggtcacc tcgctcgccg aggcgtgggt caacgggctt 5880cccgtggcat ggacttcgct cctgcccgcc acggcctccc gccccggtct gcccacctac 5940gccttccagg ccgagcgcta ctggctcgag aacactcccg ccgccctggc caccggcgac 6000gactggcgct accgcatcga ctggaagcgc ctcccggccg ccgaggggtc cgagcgcacc 6060ggcctgtccg gccgctggct cgccgtcacg ccggaggacc actccgcgca ggccgccgcc 6120gtgctcaccg cgctggtcga cgccggggcg aaggtcgagg tgctgacggc cggggcggac 6180gacgaccgtg aggccctcgc cgcccggctc accgcactga cgaccggtga cggcttcacc 6240ggcgtggtct cgctcctcga cggactcgta ccgcaggtcg cctgggtcca ggcgctcggc 6300gacgccggaa tcaaggcgcc cctgtggtcc gtcacccagg gcgcggtctc cgtcggacgt 6360ctcgacaccc ccgccgaccc cgaccgggcc atgctctggg gcctcggccg cgtcgtcgcc 6420cttgagcacc ccgaacgctg ggccggcctc gtcgacctcc ccgcccagcc cgatgccgcc 6480gccctcgccc acctcgtcac cgcactctcc ggcgccaccg gcgaggacca gatcgccatc 6540cgcaccaccg gactccacgc ccgccgcctc gcccgcgcac ccctccacgg acgtcggccc 6600acccgcgact ggcagcccca cggcaccgtc ctcatcaccg gcggcaccgg agccctcggc 6660agccacgccg cacgctggat ggcccaccac ggagccgaac acctcctcct cgtcagccgc 6720agcggcgaac aagcccccgg agccacccaa ctcaccgccg aactcaccgc atcgggcgcc 6780cgcgtcacca tcgccgcctg cgacgtcgcc gacccccacg ccatgcgcac cctcctcgac 6840gccatccccg ccgagacgcc cctcaccgcc 
gtcgtccaca ccgccggcgc gctcgacgac 6900ggcatcgtgg acacgctgac cgccgagcag gtccggcggg cccaccgtgc gaaggccgtc 6960ggcgcctcgg tgctcgacga gctgacccgg gacctcgacc tcgacgcgtt cgtgctcttc 7020tcgtccgtgt cgagcactct gggcatcccc ggtcagggca actacgcccc gcacaacgcc 7080tacctcgacg ccctcgcggc tcgccgccgg gccaccggcc ggtccgccgt ctcggtggcc 7140tggggaccgt gggacggtgg cggcatggcc gccggtgacg gcgtggccga gcggctgcgc 7200aaccacggcg tgcccggcat ggacccggaa ctcgccctgg ccgcactgga gtccgcgctc 7260ggccgggacg agaccgcgat caccgtcgcg gacatcgact gggaccgctt ctacctcgcg 7320tactcctccg gtcgcccgca gcccctcgtc gaggagctgc ccgaggtgcg gcgcatcatc 7380gacgcacggg acagcgccac gtccggacag ggcgggagct ccgcccaggg cgccaacccc 7440ctggccgagc ggctggccgc cgcggctccc ggcgagcgta cggagatcct cctcggtctc 7500gtacgggcgc aggccgccgc cgtgctccgg atgcgttcgc cggaggacgt cgccgccgac 7560cgcgccttca aggacatcgg cttcgactcg ctcgccggtg tcgagctgcg caacaggctg 7620acccgggcga ccgggctcca gctgcccgcg acgctcgtct tcgaccaccc gacgccgctg 7680gccctcgtgt cgctgctccg cagcgagttc ctcggtgacg aggagacggc ggacgcccgg 7740cggtccgcgg cgctgcccgc gactgtcggt gccggtgccg gcgccggcgc cggcaccgat 7800gccgacgacg atccgatcgc gatcgtcgcg atgagctgcc gctaccccgg tgacatccgc 7860agcccggagg acctgtggcg gatgctgtcc gagggcggcg agggcatcac gccgttcccc 7920accgaccgcg gctgggacct cgacggcctg tacgacgccg acccggacgc gctcggcagg 7980gcgtacgtcc gcgagggcgg gttcctgcac gacgcggccg agttcgacgc ggagttcttc 8040ggcgtctcgc cgcgcgaggc gctggccatg gacccgcagc agcggatgct cctgacgacg 8100tcctgggagg ccttcgagcg ggccggcatc gagccggcat cgctgcgcgg cagcagcacc 8160ggtgtcttca tcggcctctc ctaccaggac tacgcggccc gcgtcccgaa cgccccgcgt 8220ggcgtggagg gttacctgct gaccggcagc acgccgagcg tcgcgtcggg ccgtatcgcg 8280tacaccttcg gtctcgaagg gcccgcgacg accgtcgaca ccgcctgctc gtcgtcgctg 8340accgccctgc acctggcggt gcgggcgctg cgcagcggcg agtgcacgat ggcgctcgcc 8400ggtggcgtgg cgatgatggc gaccccgcac atgttcgtgg agttcagccg tcagcgggcg 8460ctcgccccgg acggccgcag caaggccttc tcggcggacg ccgacgggtt cggcgccgcg 8520gagggcgtcg gcctgctgct cgtggagcgg ctctcggacg cgcggcgcaa cggtcacccg 
8580gtgctcgccg tggtccgcgg taccgccgtc aaccaggacg gcgccagcaa cgggctgacc 8640gcgcccaacg gaccctcgca gcagcgggtg atccggcagg cgctcgccga cgcccggctg 8700gcacccggcg acatcgacgc cgtcgagacg cacggcacgg gaacctcgct gggcgacccc 8760atcgaggccc agggcctcca ggccacgtac ggcaaggagc ggcccgcgga acggccgctc 8820gccatcggct ccgtgaagtc caacatcgga cacacccagg ccgcggccgg tgcggcgggc 8880atcatcaaga tggtcctcgc gatgcgccac ggcaccctgc cgaagaccct ccacgccgac 8940gagccgagcc cgcacgtcga ctgggcgaac agcggcctgg ccctcgtcac cgagccgatc 9000gactggccgg ccggcaccgg tccgcgccgc gccgccgtct cctccttcgg catcagcggg 9060acgaacgcgc acgtcgtgct ggagcaggcg ccggatgctg ctggtgaggt gcttggggcc 9120gatgaggtgc ctgaggtgtc tgagacggta gcgatggctg ggacggctgg gacctccgag 9180gtcgctgagg gctctgaggc ctccgaggcc cccgcggccc ccggcagccg tgaggcgtcc 9240ctccccgggc acctgccctg ggtgctgtcc gccaaggacg agcagtcgct gcgcggccag 9300gccgccgccc tgcacgcgtg gctgtccgag cccgccgccg acctgtcgga cgcggacgga 9360ccggcccgcc tgcgggacgt cgggtacacg ctcgccacga gccgtaccgc cttcgcgcac 9420cgcgccgccg tgaccgccgc cgaccgggac gggttcctgg acgggctggc cacgctggcc 9480cagggcggca cctcggccca cgtccacctg gacaccgccc gggacggcac caccgcgttc 9540ctcttcaccg gccagggcag tcagcgcccc ggcgccggcc gtgagctgta cgaccggcac 9600cccgtcttcg cccgggcgct cgacgagatc tgcgcccacc tcgacggtca cctcgaactg 9660cccctgctcg acgtgatgtt cgcggccgag ggcagcgcgg aggccgcgct gctcgacgag 9720acgcggtaca cgcagtgcgc gctgttcgcc ctggaggtcg cgctcttccg gctcgtcgag 9780agctggggca tgcggccggc cgcactgctc ggtcactcgg tcggcgagat cgccgccgcg 9840cacgtcgccg gtgtgttctc gctcgccgac gccgcccgcc tggtcgccgc gcgcggccgg 9900ctcatgcagg agctgcccgc cggtggcgcg atgctcgccg tccaggccgc ggaggacgag 9960atccgcgtgt ggctggagac ggaggagcgg tacgcgggac gtctggacgt cgccgccgtc 10020aacggccccg aggccgccgt cctgtccggc gacgcggacg cggcgcggga ggcggaggcg 10080tactggtccg ggctcggccg caggacccgc gcgctgcggg tcagccacgc cttccactcc 10140gcgcacatgg acggcatgct cgacgggttc cgcgccgtcc tggagacggt ggagttccgg 10200cgcccctccc tgaccgtggt ctcgaacgtc accggcctgg ccgccggccc ggacgacctg 10260tgcgaccccg agtactgggt 
ccggcacgtc cgcggcaccg tccgcttcct cgacggcgtc 10320cgtgtcctgc gcgacctcgg cgtgcggacc tgcctggagc tgggccccga cggggtcctc 10380accgccatgg cggccgacgg cctcgcggac acccccgcgg attccgctgc cggctccccc 10440gtcggctctc ccgccggctc tcccgccgac tccgccgccg gcgcgctccg gccccggccg 10500ctgctcgtgg cgctgctgcg ccgcaagcgg tcggagaccg agaccgtcgc ggacgccctc 10560ggcagggcgc acgcccacgg caccggaccc gactggcacg cctggttcgc cggctccggg 10620gcgcaccgcg tggacctgcc cacgtactcc ttccggcgcg accgctactg gctggacgcc 10680ccggcggccg acaccgcggt ggacaccgcc ggcctcggtc tcggcaccgc cgaccacccg 10740ctgctcggcg ccgtggtcag ccttccggac cgggacggcc tgctgctcac cggccgcctc 10800tccctgcgca cccacccgtg gctcgcggac cacgccgtcc tggggagcgt cctgctcccc 10860ggcgccgcga tggtcgaact cgccgcgcac gctgcggagt ccgccggtct gcgtgacgtg 10920cgggagctga ccctccttga accgctggta ctgcccgagc acggtggcgt cgagctgcgc 10980gtgacggtcg gggcgccggc cggagagccc ggtggcgagt cggccgggga cggcgcacgg 11040cccgtctccc tccactcgcg gctcgccgac gcgcccgccg gtaccgcctg gtcctgccac 11100gcgaccggtc tgctggccac cgaccggccc gagcttcccg tcgcgcccga ccgtgcggcc 11160atgtggccgc cgcagggcgc cgaggaggtg ccgctcgacg gtctctacga gcggctcgac 11220gggaacggcc tcgccttcgg tccgctgttc caggggctga acgcggtgtg gcggtacgag 11280ggtgaggtct tcgccgacat cgcgctcccc gccaccacga atgcgaccgc gcccgcgacc 11340gcgaacggcg gcgggagtgc ggcggcggcc ccctacggca tccaccccgc cctgctcgac 11400gcttcgctgc acgccatcgc ggtcggcggt ctcgtcgacg agcccgagct cgtccgcgtc 11460cccttccact ggagcggtgt caccgtgcac gcggccggtg ccgcggcggc ccgggtccgt 11520ctcgcctccg cggggacgga cgccgtctcg ctgtccctga cggacggcga gggacgcccg 11580ctggtctccg tggaacggct cacgctgcgc ccggtcaccg ccgatcaggc ggcggcgagc 11640cgcgtcggcg ggctgatgca ccgggtggcc tggcgtccgt acgccctcgc ctcgtccggc 11700gaacaggacc cgcacgccac ttcgtacggg ccgaccgccg tcctcggcaa ggacgagctg 11760aaggtcgccg ccgccctgga gtccgcgggc gtcgaagtcg ggctctaccc cgacctggcc 11820gcgctgtccc aggacgtggc ggccggcgcc ccggcgcccc gtaccgtcct tgcgccgctg 11880cccgcgggtc ccgccgacgg cggcgcggag ggtgtacggg gcacggtggc ccggacgctg 11940gagctgctcc aggcctggct ggccgacgag 
cacctcgcgg gcacccgcct gctcctggtc 12000acccgcggtg cggtgcggga ccccgagggg tccggcgccg acgatggcgg cgaggacctg 12060tcgcacgcgg ccgcctgggg tctcgtacgg accgcgcaga ccgagaaccc cggccgcttc 12120ggccttctcg acctggccga cgacgcctcg tcgtaccgga ccctgccgtc ggtgctctcc 12180gacgcgggcc tgcgcgacga accgcagctc gccctgcacg acggcaccat caggctggcc 12240cgcctggcct ccgtccggcc cgagaccggc accgccgcac cggcgctcgc cccggagggc 12300acggtcctgc tgaccggcgg caccggcggc ctgggcggac tggtcgcccg gcacgtggtg 12360ggcgagtggg gcgtacgacg cctgctgctg gtgagccggc ggggcacgga cgccccgggc 12420gccgacgagc tcgtgcacga gctggaggcc ctgggagccg acgtctcggt ggccgcgtgc 12480gacgtcgccg accgcgaagc cctcaccgcc gtactcgacg ccatccccgc cgaacacccg 12540ctcaccgcgg tcgtccacac ggcaggcgtc ctctccgacg gcaccctccc gtccatgacg 12600acggaggacg tggaacacgt actgcggccc aaggtcgacg ccgcgttcct cctcgacgaa 12660ctcacctcga cgcccgcata cgacctggca gcgttcgtca tgttctcctc cgccgccgcc 12720gtcttcggtg gcgcggggca gggcgcctac gccgccgcca acgccaccct cgacgccctc 12780gcctggcgcc gccgggcagc cggactcccc gccctctccc tcggctgggg cctctgggcc 12840gagaccagcg gcatgaccgg cgagctcggc caggcggacc tgcgccggat gagccgcgcg 12900ggcatcggcg ggatcagcga cgccgagggc atcgcgctcc tcgacgccgc cctccgcgac 12960gaccgccacc cggtcctgct gcccctgcgg ctcgacgccg ccgggctgcg ggacgcggcc 13020gggaacgacc cggccggaat cccggcgctc ttccgggacg tcgtcggcgc caggaccgtc 13080cgggcccggc cgtccgcggc ctccgcctcg acgacagccg ggacggccgg cacgccgggg 13140acggcggacg gcgcggcgga aacggcggcg gtcacgctcg ccgaccgggc cgccaccgtg 13200gacgggcccg cacggcagcg cctgctgctc gagttcgtcg tcggcgaggt cgccgaagta 13260ctcggccacg cccgcggtca ccggatcgac gccgaacggg gcttcctcga cctcggcttc 13320gactccctga ccgccgtcga actccgcaac cggctcaact ccgccggtgg cctcgccctc 13380ccggcgaccc tggtcttcga ccacccaagc ccggcggcac tcgcctccca cctggacgcc 13440gagctgccgc gcggcgcctc ggaccaggac ggagccggga accggaacgg gaacgagaac 13500gggacgacgg cgtcccggag caccgccgag acggacgcgc tgctggcaca actgacccgc 13560ctggaaggcg ccttggtgct gacgggcctc tcggacgccc ccgggagcga agaagtcctg 13620gagcacctgc ggtccctgcg ctcgatggtc acgggcgaga 
ccgggaccgg gaccgcgtcc 13680 ggagccccgg acggcgccgg gtccggcgcc gaggaccggc cctgggcggc cggggacgga 13740 gccgggggcg ggagtgagga cggcgcggga gtgccggact tcatgaacgc ctcggccgag 13800 gaactcttcg gcctcctcga ccaggacccc agcacggact ga 13842
31 4613 PRT Streptomyces venezuelae 31 Met Ser Ser Ala Gly Ile Thr Arg Thr Gly Ala Arg Thr Pro Val Thr 1 5 10 15 Gly Arg Gly Ala Ala Ala Trp Asp Thr Gly Glu Val Arg Val Arg Arg 20 25 30 Gly Leu Pro Pro Ala Gly Pro Asp His Ala Glu His Ser Phe Ser Arg 35 40 45 Ala Pro Thr Gly Asp Val Arg Ala Glu Leu Ile Arg Gly Glu Met Ser 50 55 60 Thr Val Ser Lys Ser Glu Ser Glu Glu Phe Val Ser Val Ser Asn Asp 65 70 75 80 Ala Gly Ser Ala His Gly Thr Ala Glu Pro Val Ala Val Val Gly Ile 85 90 95 Ser Cys Arg Val Pro Gly Ala Arg Asp Pro Arg Glu Phe Trp Glu Leu 100 105 110 Leu Ala Ala Gly Gly Gln Ala Val Thr Asp Val Pro Ala Asp Arg Trp 115 120 125 Asn Ala Gly Asp Phe Tyr Asp Pro Asp Arg Ser Ala Pro Gly Arg Ser 130 135 140 Asn Ser Arg Trp Gly Gly Phe Ile Glu Asp Val Asp Arg Phe Asp Ala 145 150 155 160 Ala Phe Phe Gly Ile Ser Pro Arg Glu Ala Ala Glu Met Asp Pro Gln 165 170 175 Gln Arg Leu Ala Leu Glu Leu Gly Trp Glu Ala Leu Glu Arg Ala Gly 180 185 190 Ile Asp Pro Ser Ser Leu Thr Gly Thr Arg Thr Gly Val Phe Ala Gly 195 200 205 Ala Ile Trp Asp Asp Tyr Ala Thr Leu Lys His Arg Gln Gly Gly Ala 210 215 220 Ala Ile Thr Pro His Thr Val Thr Gly Leu His Arg Gly Ile Ile Ala 225 230 235 240 Asn Arg Leu Ser Tyr Thr Leu Gly Leu Arg Gly Pro Ser Met Val Val 245 250 255 Asp Ser Gly Gln Ser Ser Ser Leu Val Ala Val His Leu Ala Cys Glu 260 265 270 Ser Leu Arg Arg Gly Glu Ser Glu Leu Ala Leu Ala Gly Gly Val Ser 275 280 285 Leu Asn Leu Val Pro Asp Ser Ile Ile Gly Ala Ser Lys Phe Gly Gly 290 295 300 Leu Ser Pro Asp Gly Arg Ala Tyr Thr Phe Asp Ala Arg Ala Asn Gly 305 310 315 320 Tyr Val Arg Gly Glu Gly Gly Gly Phe Val Val Leu Lys Arg Leu Ser 325 330 335 Arg Ala Val Ala Asp Gly Asp Pro Val Leu Ala Val Ile Arg Gly Ser 340 345 350 Ala Val Asn Asn Gly Gly Ala Ala Gln Gly Met Thr Thr Pro Asp Ala 355 360 365 Gln Ala Gln Glu
Ala Val LeuArg Glu Ala His Glu Arg Ala Gly Thr 370 375 380 Ala Pro Ala Asp Val ArgTyr Val Glu Leu His Gly Thr Gly Thr Pro 385 390 395 400 Val Gly Asp ProIle Glu Ala Ala Ala Leu Gly Ala Ala Leu Gly Thr 405 410 415 Gly Arg ProAla Gly Gln Pro Leu Leu Val Gly Ser Val Lys Thr Asn 420 425 430 Ile GlyHis Leu Glu Gly Ala Ala Gly Ile Ala Gly Leu Ile Lys Ala 435 440 445 ValLeu Ala Val Arg Gly Arg Ala Leu Pro Ala Ser Leu Asn Tyr Glu 450 455 460Thr Pro Asn Pro Ala Ile Pro Phe Glu Glu Leu Asn Leu Arg Val Asn 465 470475 480 Thr Glu Tyr Leu Pro Trp Glu Pro Glu His Asp Gly Gln Arg Met Val485 490 495 Val Gly Val Ser Ser Phe Gly Met Gly Gly Thr Asn Ala His ValVal 500 505 510 Leu Glu Glu Ala Pro Gly Gly Cys Arg Gly Ala Ser Val ValGlu Ser 515 520 525 Thr Val Gly Gly Ser Ala Val Gly Gly Gly Val Val ProTrp Val Val 530 535 540 Ser Ala Lys Ser Ala Ala Ala Leu Asp Ala Gln IleGlu Arg Leu Ala 545 550 555 560 Ala Phe Ala Ser Arg Asp Arg Thr Asp GlyVal Asp Ala Gly Ala Val 565 570 575 Asp Ala Gly Ala Val Asp Ala Gly AlaVal Ala Arg Val Leu Ala Gly 580 585 590 Gly Arg Ala Gln Phe Glu His ArgAla Val Val Val Gly Ser Gly Pro 595 600 605 Asp Asp Leu Ala Ala Ala LeuAla Ala Pro Glu Gly Leu Val Arg Gly 610 615 620 Val Ala Ser Gly Val GlyArg Val Ala Phe Val Phe Pro Gly Gln Gly 625 630 635 640 Thr Gln Trp AlaGly Met Gly Ala Glu Leu Leu Asp Ser Ser Ala Val 645 650 655 Phe Ala AlaAla Met Ala Glu Cys Glu Ala Ala Leu Ser Pro Tyr Val 660 665 670 Asp TrpSer Leu Glu Ala Val Val Arg Gln Ala Pro Gly Ala Pro Thr 675 680 685 LeuGlu Arg Val Asp Val Val Gln Pro Val Thr Phe Ala Val Met Val 690 695 700Ser Leu Ala Arg Val Trp Gln His His Gly Val Thr Pro Gln Ala Val 705 710715 720 Val Gly His Ser Gln Gly Glu Ile Ala Ala Ala Tyr Val Ala Gly Ala725 730 735 Leu Ser Leu Asp Asp Ala Ala Arg Val Val Thr Leu Arg Ser LysSer 740 745 750 Ile Ala Ala His Leu Ala Gly Lys Gly Gly Met Leu Ser LeuAla Leu 755 760 765 Ser Glu Asp Ala Val Leu Glu Arg Leu Ala Gly Phe AspGly Leu Ser 770 775 780 Val Ala Ala Val Asn Gly Pro Thr Ala Thr Val 
ValSer Gly Asp Pro 785 790 795 800 Val Gln Ile Glu Glu Leu Ala Arg Ala CysGlu Ala Asp Gly Val Arg 805 810 815 Ala Arg Val Ile Pro Val Asp Tyr AlaSer His Ser Arg Gln Val Glu 820 825 830 Ile Ile Glu Ser Glu Leu Ala GluVal Leu Ala Gly Leu Ser Pro Gln 835 840 845 Ala Pro Arg Val Pro Phe PheSer Thr Leu Glu Gly Ala Trp Ile Thr 850 855 860 Glu Pro Val Leu Asp GlyGly Tyr Trp Tyr Arg Asn Leu Arg His Arg 865 870 875 880 Val Gly Phe AlaPro Ala Val Glu Thr Leu Ala Thr Asp Glu Gly Phe 885 890 895 Thr His PheVal Glu Val Ser Ala His Pro Val Leu Thr Met Ala Leu 900 905 910 Pro GlyThr Val Thr Gly Leu Ala Thr Leu Arg Arg Asp Asn Gly Gly 915 920 925 GlnAsp Arg Leu Val Ala Ser Leu Ala Glu Ala Trp Ala Asn Gly Leu 930 935 940Ala Val Asp Trp Ser Pro Leu Leu Pro Ser Ala Thr Gly His His Ser 945 950955 960 Asp Leu Pro Thr Tyr Ala Phe Gln Thr Glu Arg His Trp Leu Gly Glu965 970 975 Ile Glu Ala Leu Ala Pro Ala Gly Glu Pro Ala Val Gln Pro AlaVal 980 985 990 Leu Arg Thr Glu Ala Ala Glu Pro Ala Glu Leu Asp Arg AspGlu Gln 995 1000 1005 Leu Arg Val Ile Leu Asp Lys Val Arg Ala Gln ThrAla Gln Val Leu 1010 1015 1020 Gly Tyr Ala Thr Gly Gly Gln Ile Glu ValAsp Arg Thr Phe Arg Glu 1025 1030 1035 1040 Ala Gly Cys Thr Ser Leu ThrGly Val Asp Leu Arg Asn Arg Ile Asn 1045 1050 1055 Ala Ala Phe Gly ValArg Met Ala Pro Ser Met Ile Phe Asp Phe Pro 1060 1065 1070 Thr Pro GluAla Leu Ala Glu Gln Leu Leu Leu Val Val His Gly Glu 1075 1080 1085 AlaAla Ala Asn Pro Ala Gly Ala Glu Pro Ala Pro Val Ala Ala Ala 1090 10951100 Gly Ala Val Asp Glu Pro Val Ala Ile Val Gly Met Ala Cys Arg Leu1105 1110 1115 1120 Pro Gly Gly Val Ala Ser Pro Glu Asp Leu Trp Arg LeuVal Ala Gly 1125 1130 1135 Gly Gly Asp Ala Ile Ser Glu Phe Pro Gln AspArg Gly Trp Asp Val 1140 1145 1150 Glu Gly Leu Tyr His Pro Asp Pro GluHis Pro Gly Thr Ser Tyr Val 1155 1160 1165 Arg Gln Gly Gly Phe Ile GluAsn Val Ala Gly Phe Asp Ala Ala Phe 1170 1175 1180 Phe Gly Ile Ser ProArg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg 1185 1190 1195 1200 Leu LeuLeu Glu Thr Ser Trp Glu 
Ala Val Glu Asp Ala Gly Ile Asp 1205 1210 1215Pro Thr Ser Leu Arg Gly Arg Gln Val Gly Val Phe Thr Gly Ala Met 12201225 1230 Thr His Glu Tyr Gly Pro Ser Leu Arg Asp Gly Gly Glu Gly LeuAsp 1235 1240 1245 Gly Tyr Leu Leu Thr Gly Asn Thr Ala Ser Val Met SerGly Arg Val 1250 1255 1260 Ser Tyr Thr Leu Gly Leu Glu Gly Pro Ala LeuThr Val Asp Thr Ala 1265 1270 1275 1280 Cys Ser Ser Ser Leu Val Ala LeuHis Leu Ala Val Gln Ala Leu Arg 1285 1290 1295 Lys Gly Glu Val Asp MetAla Leu Ala Gly Gly Val Ala Val Met Pro 1300 1305 1310 Thr Pro Gly MetPhe Val Glu Phe Ser Arg Gln Arg Gly Leu Ala Gly 1315 1320 1325 Asp GlyArg Ser Lys Ala Phe Ala Ala Ser Ala Asp Gly Thr Ser Trp 1330 1335 1340Ser Glu Gly Val Gly Val Leu Leu Val Glu Arg Leu Ser Asp Ala Arg 13451350 1355 1360 Arg Asn Gly His Gln Val Leu Ala Val Val Arg Gly Ser AlaLeu Asn 1365 1370 1375 Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro AsnGly Pro Ser Gln 1380 1385 1390 Gln Arg Val Ile Arg Arg Ala Leu Ala AspAla Arg Leu Thr Thr Ser 1395 1400 1405 Asp Val Asp Val Val Glu Ala HisGly Thr Gly Thr Arg Leu Gly Asp 1410 1415 1420 Pro Ile Glu Ala Gln AlaLeu Ile Ala Thr Tyr Gly Gln Gly Arg Asp 1425 1430 1435 1440 Asp Glu GlnPro Leu Arg Leu Gly Ser Leu Lys Ser Asn Ile Gly His 1445 1450 1455 ThrGln Ala Ala Ala Gly Val Ser Gly Val Ile Lys Met Val Gln Ala 1460 14651470 Met Arg His Gly Leu Leu Pro Lys Thr Leu His Val Asp Glu Pro Ser1475 1480 1485 Asp Gln Ile Asp Trp Ser Ala Gly Ala Val Glu Leu Leu ThrGlu Ala 1490 1495 1500 Val Asp Trp Pro Glu Lys Gln Asp Gly Gly Leu ArgArg Ala Ala Val 1505 1510 1515 1520 Ser Ser Phe Gly Ile Ser Gly Thr AsnAla His Val Val Leu Glu Glu 1525 1530 1535 Ala Pro Val Val Val Glu GlyAla Ser Val Val Glu Pro Ser Val Gly 1540 1545 1550 Gly Ser Ala Val GlyGly Gly Val Thr Pro Trp Val Val Ser Ala Lys 1555 1560 1565 Ser Ala AlaAla Leu Asp Ala Gln Ile Glu Arg Leu Ala Ala Phe Ala 1570 1575 1580 SerArg Asp Arg Thr Asp Asp Ala Asp Ala Gly Ala Val Asp Ala Gly 1585 15901595 1600 Ala Val Ala His Val Leu Ala Asp Gly Arg Ala Gln Phe Glu 
HisArg 1605 1610 1615 Ala Val Ala Leu Gly Ala Gly Ala Asp Asp Leu Val GlnAla Leu Ala 1620 1625 1630 Asp Pro Asp Gly Leu Ile Arg Gly Thr Ala SerGly Val Gly Arg Val 1635 1640 1645 Ala Phe Val Phe Pro Gly Gln Gly ThrGln Trp Ala Gly Met Gly Ala 1650 1655 1660 Glu Leu Leu Asp Ser Ser AlaVal Phe Ala Ala Ala Met Ala Glu Cys 1665 1670 1675 1680 Glu Ala Ala LeuSer Pro Tyr Val Asp Trp Ser Leu Glu Ala Val Val 1685 1690 1695 Arg GlnAla Pro Gly Ala Pro Thr Leu Glu Arg Val Asp Val Val Gln 1700 1705 1710Pro Val Thr Phe Ala Val Met Val Ser Leu Ala Arg Val Trp Gln His 17151720 1725 His Gly Val Thr Pro Gln Ala Val Val Gly His Ser Gln Gly GluIle 1730 1735 1740 Ala Ala Ala Tyr Val Ala Gly Ala Leu Pro Leu Asp AspAla Ala Arg 1745 1750 1755 1760 Val Val Thr Leu Arg Ser Lys Ser Ile AlaAla His Leu Ala Gly Lys 1765 1770 1775 Gly Gly Met Leu Ser Leu Ala LeuAsn Glu Asp Ala Val Leu Glu Arg 1780 1785 1790 Leu Ser Asp Phe Asp GlyLeu Ser Val Ala Ala Val Asn Gly Pro Thr 1795 1800 1805 Ala Thr Val ValSer Gly Asp Pro Val Gln Ile Glu Glu Leu Ala Gln 1810 1815 1820 Ala CysLys Ala Asp Gly Phe Arg Ala Arg Ile Ile Pro Val Asp Tyr 1825 1830 18351840 Ala Ser His Ser Arg Gln Val Glu Ile Ile Glu Ser Glu Leu Ala Gln1845 1850 1855 Val Leu Ala Gly Leu Ser Pro Gln Ala Pro Arg Val Pro PhePhe Ser 1860 1865 1870 Thr Leu Glu Gly Thr Trp Ile Thr Glu Pro Val LeuAsp Gly Thr Tyr 1875 1880 1885 Trp Tyr Arg Asn Leu Arg His Arg Val GlyPhe Ala Pro Ala Ile Glu 1890 1895 1900 Thr Leu Ala Val Asp Glu Gly PheThr His Phe Val Glu Val Ser Ala 1905 1910 1915 1920 His Pro Val Leu ThrMet Thr Leu Pro Glu Thr Val Thr Gly Leu Gly 1925 1930 1935 Thr Leu ArgArg Glu Gln Gly Gly Gln Glu Arg Leu Val Thr Ser Leu 1940 1945 1950 AlaGlu Ala Trp Val Asn Gly Leu Pro Val Ala Trp Thr Ser Leu Leu 1955 19601965 Pro Ala Thr Ala Ser Arg Pro Gly Leu Pro Thr Tyr Ala Phe Gln Ala1970 1975 1980 Glu Arg Tyr Trp Leu Glu Asn Thr Pro Ala Ala Leu Ala ThrGly Asp 1985 1990 1995 2000 Asp Trp Arg Tyr Arg Ile Asp Trp Lys Arg LeuPro Ala Ala Glu Gly 2005 2010 2015 Ser 
Glu Arg Thr Gly Leu Ser Gly ArgTrp Leu Ala Val Thr Pro Glu 2020 2025 2030 Asp His Ser Ala Gln Ala AlaAla Val Leu Thr Ala Leu Val Asp Ala 2035 2040 2045 Gly Ala Lys Val GluVal Leu Thr Ala Gly Ala Asp Asp Asp Arg Glu 2050 2055 2060 Ala Leu AlaAla Arg Leu Thr Ala Leu Thr Thr Gly Asp Gly Phe Thr 2065 2070 2075 2080Gly Val Val Ser Leu Leu Asp Gly Leu Val Pro Gln Val Ala Trp Val 20852090 2095 Gln Ala Leu Gly Asp Ala Gly Ile Lys Ala Pro Leu Trp Ser ValThr 2100 2105 2110 Gln Gly Ala Val Ser Val Gly Arg Leu Asp Thr Pro AlaAsp Pro Asp 2115 2120 2125 Arg Ala Met Leu Trp Gly Leu Gly Arg Val ValAla Leu Glu His Pro 2130 2135 2140 Glu Arg Trp Ala Gly Leu Val Asp LeuPro Ala Gln Pro Asp Ala Ala 2145 2150 2155 2160 Ala Leu Ala His Leu ValThr Ala Leu Ser Gly Ala Thr Gly Glu Asp 2165 2170 2175 Gln Ile Ala IleArg Thr Thr Gly Leu His Ala Arg Arg Leu Ala Arg 2180 2185 2190 Ala ProLeu His Gly Arg Arg Pro Thr Arg Asp Trp Gln Pro His Gly 2195 2200 2205Thr Val Leu Ile Thr Gly Gly Thr Gly Ala Leu Gly Ser His Ala Ala 22102215 2220 Arg Trp Met Ala His His Gly Ala Glu His Leu Leu Leu Val SerArg 2225 2230 2235 2240 Ser Gly Glu Gln Ala Pro Gly Ala Thr Gln Leu ThrAla Glu Leu Thr 2245 2250 2255 Ala Ser Gly Ala Arg Val Thr Ile Ala AlaCys Asp Val Ala Asp Pro 2260 2265 2270 His Ala Met Arg Thr Leu Leu AspAla Ile Pro Ala Glu Thr Pro Leu 2275 2280 2285 Thr Ala Val Val His ThrAla Gly Ala Leu Asp Asp Gly Ile Val Asp 2290 2295 2300 Thr Leu Thr AlaGlu Gln Val Arg Arg Ala His Arg Ala Lys Ala Val 2305 2310 2315 2320 GlyAla Ser Val Leu Asp Glu Leu Thr Arg Asp Leu Asp Leu Asp Ala 2325 23302335 Phe Val Leu Phe Ser Ser Val Ser Ser Thr Leu Gly Ile Pro Gly Gln2340 2345 2350 Gly Asn Tyr Ala Pro His Asn Ala Tyr Leu Asp Ala Leu AlaAla Arg 2355 2360 2365 Arg Arg Ala Thr Gly Arg Ser Ala Val Ser Val AlaTrp Gly Pro Trp 2370 2375 2380 Asp Gly Gly Gly Met Ala Ala Gly Asp GlyVal Ala Glu Arg Leu Arg 2385 2390 2395 2400 Asn His Gly Val Pro Gly MetAsp Pro Glu Leu Ala Leu Ala Ala Leu 2405 2410 2415 Glu Ser Ala Leu GlyArg Asp Glu 
Thr Ala Ile Thr Val Ala Asp Ile 2420 2425 2430 Asp Trp AspArg Phe Tyr Leu Ala Tyr Ser Ser Gly Arg Pro Gln Pro 2435 2440 2445 LeuVal Glu Glu Leu Pro Glu Val Arg Arg Ile Ile Asp Ala Arg Asp 2450 24552460 Ser Ala Thr Ser Gly Gln Gly Gly Ser Ser Ala Gln Gly Ala Asn Pro2465 2470 2475 2480 Leu Ala Glu Arg Leu Ala Ala Ala Ala Pro Gly Glu ArgThr Glu Ile 2485 2490 2495 Leu Leu Gly Leu Val Arg Ala Gln Ala Ala AlaVal Leu Arg Met Arg 2500 2505 2510 Ser Pro Glu Asp Val Ala Ala Asp ArgAla Phe Lys Asp Ile Gly Phe 2515 2520 2525 Asp Ser Leu Ala Gly Val GluLeu Arg Asn Arg Leu Thr Arg Ala Thr 2530 2535 2540 Gly Leu Gln Leu ProAla Thr Leu Val Phe Asp His Pro Thr Pro Leu 2545 2550 2555 2560 Ala LeuVal Ser Leu Leu Arg Ser Glu Phe Leu Gly Asp Glu Glu Thr 2565 2570 2575Ala Asp Ala Arg Arg Ser Ala Ala Leu Pro Ala Thr Val Gly Ala Gly 25802585 2590 Ala Gly Ala Gly Ala Gly Thr Asp Ala Asp Asp Asp Pro Ile AlaIle 2595 2600 2605 Val Ala Met Ser Cys Arg Tyr Pro Gly Asp Ile Arg SerPro Glu Asp 2610 2615 2620 Leu Trp Arg Met Leu Ser Glu Gly Gly Glu GlyIle Thr Pro Phe Pro 2625 2630 2635 2640 Thr Asp Arg Gly Trp Asp Leu AspGly Leu Tyr Asp Ala Asp Pro Asp 2645 2650 2655 Ala Leu Gly Arg Ala TyrVal Arg Glu Gly Gly Phe Leu His Asp Ala 2660 2665 2670 Ala Glu Phe AspAla Glu Phe Phe Gly Val Ser Pro Arg Glu Ala Leu 2675 2680 2685 Ala MetAsp Pro Gln Gln Arg Met Leu Leu Thr Thr Ser Trp Glu Ala 2690 2695 2700Phe Glu Arg Ala Gly Ile Glu Pro Ala Ser Leu Arg Gly Ser Ser Thr 27052710 2715 2720 Gly Val Phe Ile Gly Leu Ser Tyr Gln Asp Tyr Ala Ala ArgVal Pro 2725 2730 2735 Asn Ala Pro Arg Gly Val Glu Gly Tyr Leu Leu ThrGly Ser Thr Pro 2740 2745 2750 Ser Val Ala Ser Gly Arg Ile Ala Tyr ThrPhe Gly Leu Glu Gly Pro 2755 2760 2765 Ala Thr Thr Val Asp Thr Ala CysSer Ser Ser Leu Thr Ala Leu His 2770 2775 2780 Leu Ala Val Arg Ala LeuArg Ser Gly Glu Cys Thr Met Ala Leu Ala 2785 2790 2795 2800 Gly Gly ValAla Met Met Ala Thr Pro His Met Phe Val Glu Phe Ser 2805 2810 2815 ArgGln Arg Ala Leu Ala Pro Asp Gly Arg Ser Lys Ala Phe Ser 
Ala 2820 28252830 Asp Ala Asp Gly Phe Gly Ala Ala Glu Gly Val Gly Leu Leu Leu Val2835 2840 2845 Glu Arg Leu Ser Asp Ala Arg Arg Asn Gly His Pro Val LeuAla Val 2850 2855 2860 Val Arg Gly Thr Ala Val Asn Gln Asp Gly Ala SerAsn Gly Leu Thr 2865 2870 2875 2880 Ala Pro Asn Gly Pro Ser Gln Gln ArgVal Ile Arg Gln Ala Leu Ala 2885 2890 2895 Asp Ala Arg Leu Ala Pro GlyAsp Ile Asp Ala Val Glu Thr His Gly 2900 2905 2910 Thr Gly Thr Ser LeuGly Asp Pro Ile Glu Ala Gln Gly Leu Gln Ala 2915 2920 2925 Thr Tyr GlyLys Glu Arg Pro Ala Glu Arg Pro Leu Ala Ile Gly Ser 2930 2935 2940 ValLys Ser Asn Ile Gly His Thr Gln Ala Ala Ala Gly Ala Ala Gly 2945 29502955 2960 Ile Ile Lys Met Val Leu Ala Met Arg His Gly Thr Leu Pro LysThr 2965 2970 2975 Leu His Ala Asp Glu Pro Ser Pro His Val Asp Trp AlaAsn Ser Gly 2980 2985 2990 Leu Ala Leu Val Thr Glu Pro Ile Asp Trp ProAla Gly Thr Gly Pro 2995 3000 3005 Arg Arg Ala Ala Val Ser Ser Phe GlyIle Ser Gly Thr Asn Ala His 3010 3015 3020 Val Val Leu Glu Gln Ala ProAsp Ala Ala Gly Glu Val Leu Gly Ala 3025 3030 3035 3040 Asp Glu Val ProGlu Val Ser Glu Thr Val Ala Met Ala Gly Thr Ala 3045 3050 3055 Gly ThrSer Glu Val Ala Glu Gly Ser Glu Ala Ser Glu Ala Pro Ala 3060 3065 3070Ala Pro Gly Ser Arg Glu Ala Ser Leu Pro Gly His Leu Pro Trp Val 30753080 3085 Leu Ser Ala Lys Asp Glu Gln Ser Leu Arg Gly Gln Ala Ala AlaLeu 3090 3095 3100 His Ala Trp Leu Ser Glu Pro Ala Ala Asp Leu Ser AspAla Asp Gly 3105 3110 3115 3120 Pro Ala Arg Leu Arg Asp Val Gly Tyr ThrLeu Ala Thr Ser Arg Thr 3125 3130 3135 Ala Phe Ala His Arg Ala Ala ValThr Ala Ala Asp Arg Asp Gly Phe 3140 3145 3150 Leu Asp Gly Leu Ala ThrLeu Ala Gln Gly Gly Thr Ser Ala His Val 3155 3160 3165 His Leu Asp ThrAla Arg Asp Gly Thr Thr Ala Phe Leu Phe Thr Gly 3170 3175 3180 Gln GlySer Gln Arg Pro Gly Ala Gly Arg Glu Leu Tyr Asp Arg His 3185 3190 31953200 Pro Val Phe Ala Arg Ala Leu Asp Glu Ile Cys Ala His Leu Asp Gly3205 3210 3215 His Leu Glu Leu Pro Leu Leu Asp Val Met Phe Ala Ala GluGly Ser 3220 3225 3230 Ala Glu 
Ala Ala Leu Leu Asp Glu Thr Arg Tyr ThrGln Cys Ala Leu 3235 3240 3245 Phe Ala Leu Glu Val Ala Leu Phe Arg LeuVal Glu Ser Trp Gly Met 3250 3255 3260 Arg Pro Ala Ala Leu Leu Gly HisSer Val Gly Glu Ile Ala Ala Ala 3265 3270 3275 3280 His Val Ala Gly ValPhe Ser Leu Ala Asp Ala Ala Arg Leu Val Ala 3285 3290 3295 Ala Arg GlyArg Leu Met Gln Glu Leu Pro Ala Gly Gly Ala Met Leu 3300 3305 3310 AlaVal Gln Ala Ala Glu Asp Glu Ile Arg Val Trp Leu Glu Thr Glu 3315 33203325 Glu Arg Tyr Ala Gly Arg Leu Asp Val Ala Ala Val Asn Gly Pro Glu3330 3335 3340 Ala Ala Val Leu Ser Gly Asp Ala Asp Ala Ala Arg Glu AlaGlu Ala 3345 3350 3355 3360 Tyr Trp Ser Gly Leu Gly Arg Arg Thr Arg AlaLeu Arg Val Ser His 3365 3370 3375 Ala Phe His Ser Ala His Met Asp GlyMet Leu Asp Gly Phe Arg Ala 3380 3385 3390 Val Leu Glu Thr Val Glu PheArg Arg Pro Ser Leu Thr Val Val Ser 3395 3400 3405 Asn Val Thr Gly LeuAla Ala Gly Pro Asp Asp Leu Cys Asp Pro Glu 3410 3415 3420 Tyr Trp ValArg His Val Arg Gly Thr Val Arg Phe Leu Asp Gly Val 3425 3430 3435 3440Arg Val Leu Arg Asp Leu Gly Val Arg Thr Cys Leu Glu Leu Gly Pro 34453450 3455 Asp Gly Val Leu Thr Ala Met Ala Ala Asp Gly Leu Ala Asp ThrPro 3460 3465 3470 Ala Asp Ser Ala Ala Gly Ser Pro Val Gly Ser Pro AlaGly Ser Pro 3475 3480 3485 Ala Asp Ser Ala Ala Gly Ala Leu Arg Pro ArgPro Leu Leu Val Ala 3490 3495 3500 Leu Leu Arg Arg Lys Arg Ser Glu ThrGlu Thr Val Ala Asp Ala Leu 3505 3510 3515 3520 Gly Arg Ala His Ala HisGly Thr Gly Pro Asp Trp His Ala Trp Phe 3525 3530 3535 Ala Gly Ser GlyAla His Arg Val Asp Leu Pro Thr Tyr Ser Phe Arg 3540 3545 3550 Arg AspArg Tyr Trp Leu Asp Ala Pro Ala Ala Asp Thr Ala Val Asp 3555 3560 3565Thr Ala Gly Leu Gly Leu Gly Thr Ala Asp His Pro Leu Leu Gly Ala 35703575 3580 Val Val Ser Leu Pro Asp Arg Asp Gly Leu Leu Leu Thr Gly ArgLeu 3585 3590 3595 3600 Ser Leu Arg Thr His Pro Trp Leu Ala Asp His AlaVal Leu Gly Ser 3605 3610 3615 Val Leu Leu Pro Gly Ala Ala Met Val GluLeu Ala Ala His Ala Ala 3620 3625 3630 Glu Ser Ala Gly Leu Arg Asp ValArg 
Glu Leu Thr Leu Leu Glu Pro 3635 3640 3645 Leu Val Leu Pro Glu HisGly Gly Val Glu Leu Arg Val Thr Val Gly 3650 3655 3660 Ala Pro Ala GlyGlu Pro Gly Gly Glu Ser Ala Gly Asp Gly Ala Arg 3665 3670 3675 3680 ProVal Ser Leu His Ser Arg Leu Ala Asp Ala Pro Ala Gly Thr Ala 3685 36903695 Trp Ser Cys His Ala Thr Gly Leu Leu Ala Thr Asp Arg Pro Glu Leu3700 3705 3710 Pro Val Ala Pro Asp Arg Ala Ala Met Trp Pro Pro Gln GlyAla Glu 3715 3720 3725 Glu Val Pro Leu Asp Gly Leu Tyr Glu Arg Leu AspGly Asn Gly Leu 3730 3735 3740 Ala Phe Gly Pro Leu Phe Gln Gly Leu AsnAla Val Trp Arg Tyr Glu 3745 3750 3755 3760 Gly Glu Val Phe Ala Asp IleAla Leu Pro Ala Thr Thr Asn Ala Thr 3765 3770 3775 Ala Pro Ala Thr AlaAsn Gly Gly Gly Ser Ala Ala Ala Ala Pro Tyr 3780 3785 3790 Gly Ile HisPro Ala Leu Leu Asp Ala Ser Leu His Ala Ile Ala Val 3795 3800 3805 GlyGly Leu Val Asp Glu Pro Glu Leu Val Arg Val Pro Phe His Trp 3810 38153820 Ser Gly Val Thr Val His Ala Ala Gly Ala Ala Ala Ala Arg Val Arg3825 3830 3835 3840 Leu Ala Ser Ala Gly Thr Asp Ala Val Ser Leu Ser LeuThr Asp Gly 3845 3850 3855 Glu Gly Arg Pro Leu Val Ser Val Glu Arg LeuThr Leu Arg Pro Val 3860 3865 3870 Thr Ala Asp Gln Ala Ala Ala Ser ArgVal Gly Gly Leu Met His Arg 3875 3880 3885 Val Ala Trp Arg Pro Tyr AlaLeu Ala Ser Ser Gly Glu Gln Asp Pro 3890 3895 3900 His Ala Thr Ser TyrGly Pro Thr Ala Val Leu Gly Lys Asp Glu Leu 3905 3910 3915 3920 Lys ValAla Ala Ala Leu Glu Ser Ala Gly Val Glu Val Gly Leu Tyr 3925 3930 3935Pro Asp Leu Ala Ala Leu Ser Gln Asp Val Ala Ala Gly Ala Pro Ala 39403945 3950 Pro Arg Thr Val Leu Ala Pro Leu Pro Ala Gly Pro Ala Asp GlyGly 3955 3960 3965 Ala Glu Gly Val Arg Gly Thr Val Ala Arg Thr Leu GluLeu Leu Gln 3970 3975 3980 Ala Trp Leu Ala Asp Glu His Leu Ala Gly ThrArg Leu Leu Leu Val 3985 3990 3995 4000 Thr Arg Gly Ala Val Arg Asp ProGlu Gly Ser Gly Ala Asp Asp Gly 4005 4010 4015 Gly Glu Asp Leu Ser HisAla Ala Ala Trp Gly Leu Val Arg Thr Ala 4020 4025 4030 Gln Thr Glu AsnPro Gly Arg Phe Gly Leu Leu Asp Leu Ala Asp Asp 
4035 4040 4045 Ala SerSer Tyr Arg Thr Leu Pro Ser Val Leu Ser Asp Ala Gly Leu 4050 4055 4060Arg Asp Glu Pro Gln Leu Ala Leu His Asp Gly Thr Ile Arg Leu Ala 40654070 4075 4080 Arg Leu Ala Ser Val Arg Pro Glu Thr Gly Thr Ala Ala ProAla Leu 4085 4090 4095 Ala Pro Glu Gly Thr Val Leu Leu Thr Gly Gly ThrGly Gly Leu Gly 4100 4105 4110 Gly Leu Val Ala Arg His Val Val Gly GluTrp Gly Val Arg Arg Leu 4115 4120 4125 Leu Leu Val Ser Arg Arg Gly ThrAsp Ala Pro Gly Ala Asp Glu Leu 4130 4135 4140 Val His Glu Leu Glu AlaLeu Gly Ala Asp Val Ser Val Ala Ala Cys 4145 4150 4155 4160 Asp Val AlaAsp Arg Glu Ala Leu Thr Ala Val Leu Asp Ala Ile Pro 4165 4170 4175 AlaGlu His Pro Leu Thr Ala Val Val His Thr Ala Gly Val Leu Ser 4180 41854190 Asp Gly Thr Leu Pro Ser Met Thr Thr Glu Asp Val Glu His Val Leu4195 4200 4205 Arg Pro Lys Val Asp Ala Ala Phe Leu Leu Asp Glu Leu ThrSer Thr 4210 4215 4220 Pro Ala Tyr Asp Leu Ala Ala Phe Val Met Phe SerSer Ala Ala Ala 4225 4230 4235 4240 Val Phe Gly Gly Ala Gly Gln Gly AlaTyr Ala Ala Ala Asn Ala Thr 4245 4250 4255 Leu Asp Ala Leu Ala Trp ArgArg Arg Ala Ala Gly Leu Pro Ala Leu 4260 4265 4270 Ser Leu Gly Trp GlyLeu Trp Ala Glu Thr Ser Gly Met Thr Gly Glu 4275 4280 4285 Leu Gly GlnAla Asp Leu Arg Arg Met Ser Arg Ala Gly Ile Gly Gly 4290 4295 4300 IleSer Asp Ala Glu Gly Ile Ala Leu Leu Asp Ala Ala Leu Arg Asp 4305 43104315 4320 Asp Arg His Pro Val Leu Leu Pro Leu Arg Leu Asp Ala Ala GlyLeu 4325 4330 4335 Arg Asp Ala Ala Gly Asn Asp Pro Ala Gly Ile Pro AlaLeu Phe Arg 4340 4345 4350 Asp Val Val Gly Ala Arg Thr Val Arg Ala ArgPro Ser Ala Ala Ser 4355 4360 4365 Ala Ser Thr Thr Ala Gly Thr Ala GlyThr Pro Gly Thr Ala Asp Gly 4370 4375 4380 Ala Ala Glu Thr Ala Ala ValThr Leu Ala Asp Arg Ala Ala Thr Val 4385 4390 4395 4400 Asp Gly Pro AlaArg Gln Arg Leu Leu Leu Glu Phe Val Val Gly Glu 4405 4410 4415 Val AlaGlu Val Leu Gly His Ala Arg Gly His Arg Ile Asp Ala Glu 4420 4425 4430Arg Gly Phe Leu Asp Leu Gly Phe Asp Ser Leu Thr Ala Val Glu Leu 44354440 4445 Arg Asn Arg 
Leu Asn Ser Ala Gly Gly Leu Ala Leu Pro Ala ThrLeu 4450 4455 4460 Val Phe Asp His Pro Ser Pro Ala Ala Leu Ala Ser HisLeu Asp Ala 4465 4470 4475 4480 Glu Leu Pro Arg Gly Ala Ser Asp Gln AspGly Ala Gly Asn Arg Asn 4485 4490 4495 Gly Asn Glu Asn Gly Thr Thr AlaSer Arg Ser Thr Ala Glu Thr Asp 4500 4505 4510 Ala Leu Leu Ala Gln LeuThr Arg Leu Glu Gly Ala Leu Val Leu Thr 4515 4520 4525 Gly Leu Ser AspAla Pro Gly Ser Glu Glu Val Leu Glu His Leu Arg 4530 4535 4540 Ser LeuArg Ser Met Val Thr Gly Glu Thr Gly Thr Gly Thr Ala Ser 4545 4550 45554560 Gly Ala Pro Asp Gly Ala Gly Ser Gly Ala Glu Asp Arg Pro Trp Ala4565 4570 4575 Ala Gly Asp Gly Ala Gly Gly Gly Ser Glu Asp Gly Ala GlyVal Pro 4580 4585 4590 Asp Phe Met Asn Ala Ser Ala Glu Glu Leu Phe GlyLeu Leu Asp Gln 4595 4600 4605 Asp Pro Ser Thr Asp 4610 32 11220 DNAStreptomyces venezuelae 32 gtgtccacgg tgaacgaaga gaagtacctc gactacctgcgtcgtgccac ggcggacctc 60 cacgaggccc gtggccgcct ccgcgagctg gaggcgaaggcgggcgagcc ggtggcgatc 120 gtcggcatgg cctgccgcct gcccggcggc gtcgcctcgcccgaggacct gtggcggctg 180 gtggccggcg gcgaggacgc gatctcggag ttcccccaggaccgcggctg ggacgtggag 240 ggcctgtacg acccgaaccc ggaggccacg ggcaagagttacgcccgcga ggccggattc 300 ctgtacgagg cgggcgagtt cgacgccgac ttcttcgggatctcgccgcg cgaggccctc 360 gccatggacc cgcagcagcg tctcctcctg gaggcctcctgggaggcgtt cgagcacgcc 420 gggatcccgg cggccaccgc gcgcggcacc tcggtcggcgtcttcaccgg cgtgatgtac 480 cacgactacg ccacccgtct caccgatgtc ccggagggcatcgagggcta cctgggcacc 540 ggcaactccg gcagtgtcgc ctcgggccgc gtcgcgtacacgcttggcct ggaggggccg 600 gccgtcacgg tcgacaccgc ctgctcgtcc tcgctggtcgccctgcacct cgccgtgcag 660 gccctgcgca agggcgaggt cgacatggcg ctcgccggcggcgtgacggt catgtcgacg 720 cccagcacct tcgtcgagtt cagccgtcag cgcgggctggcgccggacgg ccggtcgaag 780 tccttctcgt cgacggccga cggcaccagc tggtccgagggcgtcggcgt cctcctcgtc 840 gagcgcctgt ccgacgcgcg tcgcaagggc catcggatcctcgccgtggt ccggggcacc 900 gccgtcaacc aggacggcgc cagcagcggc ctcacggctccgaacgggcc gtcgcagcag 960 cgcgtcatcc gacgtgccct ggcggacgcc cggctcacgacctccgacgt 
ggacgtcgtc 1020 gaggcccacg gcacgggtac gcgactcggc gacccgatcgaggcgcaggc cgtcatcgcc 1080 acgtacgggc agggccgtga cggcgaacag ccgctgcgcctcgggtcgtt gaagtccaac 1140 atcggacaca cccaggccgc cgccggtgtc tccggcgtgatcaagatggt ccaggcgatg 1200 cgccacggcg tcctgccgaa gacgctccac gtggagaagccgacggacca ggtggactgg 1260 tccgcgggcg cggtcgagct gctcaccgag gccatggactggccggacaa gggcgacggc 1320 ggactgcgca gggccgcggt ctcctccttc ggcgtcagcgggacgaacgc gcacgtcgtg 1380 ctcgaagagg ccccggcggc cgaggagacc cctgcctccgaggcgacccc ggccgtcgag 1440 ccgtcggtcg gcgccggcct ggtgccgtgg ctggtgtcggcgaagactcc ggccgcgctg 1500 gacgcccaga tcggacgcct cgccgcgttc gcctcgcagggccgtacgga cgccgccgat 1560 ccgggcgcgg tcgctcgcgt actggccggc gggcgcgccgagttcgagca ccgggccgtc 1620 gtgctcggca ccggacagga cgatttcgcg caggcgctgaccgctccgga aggactgata 1680 cgcggcacgc cctcggacgt gggccgggtg gcgttcgtgttccccggtca gggcacgcag 1740 tgggccggga tgggcgccga actcctcgac gtgtcgaaggagttcgcggc ggccatggcc 1800 gagtgcgaga gcgcgctctc ccgctatgtc gactggtcgctggaggccgt cgtccggcag 1860 gcgccgggcg cgcccacgct ggagcgggtc gacgtcgtccagcccgtgac cttcgctgtc 1920 atggtttcgc tggcgaaggt ctggcagcac cacggcgtgacgccgcaggc cgtcgtcggc 1980 cactcgcagg gcgagatcgc cgccgcgtac gtcgccggtgccctcaccct cgacgacgcc 2040 gcccgcgtcg tcaccctgcg cagcaagtcc atcgccgcccacctcgccgg caagggcggc 2100 atgatctccc tcgccctcag cgaggaagcc acccggcagcgcatcgagaa cctccacgga 2160 ctgtcgatcg ccgccgtcaa cggccccacc gccaccgtggtttcgggcga ccccacccag 2220 atccaagagc tcgctcaggc gtgtgaggcc gacggggtccgcgcacggat catccccgtc 2280 gactacgcct cccacagcgc ccacgtcgag accatcgagagcgaactcgc cgaggtcctc 2340 gccgggctca gcccgcggac acctgaggtg ccgttcttctcgacactcga aggcgcctgg 2400 atcaccgagc cggtgctcga cggcacctac tggtaccgcaacctccgcca ccgcgtcggc 2460 ttcgcccccg ccgtcgagac cctcgccacc gacgaaggcttcacccactt catcgaggtc 2520 agcgcccacc ccgtcctcac catgaccctc cccgagaccgtcaccggcct cggcaccctc 2580 cgccgcgaac agggaggcca ggagcgtctg gtcacctcactcgccgaagc ctggaccaac 2640 ggcctcacca tcgactgggc gcccgtcctc cccaccgcaaccggccacca ccccgagctc 2700 cccacctacg ccttccagcg 
ccgtcactac tggctccacgactcccccgc cgtccagggc 2760 tccgtgcagg actcctggcg ctaccgcatc gactggaagcgcctcgcggt cgccgacgcg 2820 tccgagcgcg ccgggctgtc cgggcgctgg ctcgtcgtcgtccccgagga ccgttccgcc 2880 gaggccgccc cggtgctcgc cgcgctgtcc ggcgccggcgccgaccccgt acagctggac 2940 gtgtccccgc tgggcgaccg gcagcggctc gccgcgacgctgggcgaggc cctggcggcg 3000 gccggtggag ccgtcgacgg cgtcctctcg ctgctcgcgtgggacgagag cgcgcacccc 3060 ggccaccccg cccccttcac ccggggcacc ggcgccaccctcaccctggt gcaggcgctg 3120 gaggacgccg gcgtcgccgc cccgctgtgg tgcgtgacccacggcgcggt gtccgtcggc 3180 cgggccgacc acgtcacctc ccccgcccag gccatggtgtggggcatggg ccgggtcgcc 3240 gccctggagc accccgagcg gtggggcggc ctgatcgacctgccctcgga cgccgaccgg 3300 gcggccctgg accgcatgac cacggtcctc gccggcggtacgggtgagga ccaggtcgcg 3360 gtacgcgcct ccgggctgct cgcccgccgc ctcgtccgcgcctccctccc ggcgcacggc 3420 acggcttcgc cgtggtggca ggccgacggc acggtgctcgtcaccggtgc cgaggagcct 3480 gcggccgccg aggccgcacg ccggctggcc cgcgacggcgccggacacct cctcctccac 3540 accaccccct ccggcagcga aggcgccgaa ggcacctccggtgccgccga ggactccggc 3600 ctcgccgggc tcgtcgccga actcgcggac ctgggcgcgacggccaccgt cgtgacctgc 3660 gacctcacgg acgcggaggc ggccgcccgg ctgctcgccggcgtctccga cgcgcacccg 3720 ctcagcgccg tcctccacct gccgcccacc gtcgactccgagccgctcgc cgcgaccgac 3780 gcggacgcgc tcgcccgtgt cgtgaccgcg aaggccaccgccgcgctcca cctggaccgc 3840 ctcctgcggg aggccgcggc tgccggaggc cgtccgcccgtcctggtcct cttctcctcg 3900 gtcgccgcga tctggggcgg cgccggtcag ggcgcgtacgccgccggtac ggccttcctc 3960 gacgccctcg ccggtcagca ccgggccgac ggccccaccgtgacctcggt ggcctggagc 4020 ccctgggagg gcagccgcgt caccgagggt gcgaccggggagcggctgcg ccgcctcggc 4080 ctgcgccccc tcgcccccgc gacggcgctc accgccctggacaccgcgct cggccacggc 4140 gacaccgccg tcacgatcgc cgacgtcgac tggtcgagcttcgcccccgg cttcaccacg 4200 gcccggccgg gcaccctcct cgccgatctg cccgaggcgcgccgcgcgct cgacgagcag 4260 cagtcgacga cggccgccga cgacaccgtc ctgagccgcgagctcggtgc gctcaccggc 4320 gccgaacagc agcgccgtat gcaggagttg gtccgcgagcacctcgccgt ggtcctcaac 4380 cacccctccc ccgaggccgt cgacacgggg cgggccttccgtgacctcgg 
attcgactcg 4440 ctgacggcgg tcgagctccg caaccgcctc aagaacgccaccggcctggc cctcccggcc 4500 actctggtct tcgactaccc gaccccccgg acgctggcggagttcctcct cgcggagatc 4560 ctgggcgagc aggccggtgc cggcgagcag cttccggtggacggcggggt cgacgacgag 4620 cccgtcgcga tcgtcggcat ggcgtgccgc ctgccgggcggtgtcgcctc gccggaggac 4680 ctgtggcggc tggtggccgg cggcgaggac gcgatctccggcttcccgca ggaccgcggc 4740 tgggacgtgg aggggctgta cgacccggac ccggacgcgtccgggcggac gtactgccgt 4800 gccggtggct tcctcgacga ggcgggcgag ttcgacgccgacttcttcgg gatctcgccg 4860 cgcgaggccc tcgccatgga cccgcagcag cggctcctcctggagacctc ctgggaggcc 4920 gtcgaggacg ccgggatcga cccgacctcc cttcaggggcagcaggtcgg cgtgttcgcg 4980 ggcaccaacg gcccccacta cgagccgctg ctccgcaacaccgccgagga tcttgagggt 5040 tacgtcggga cgggcaacgc cgccagcatc atgtcgggccgtgtctcgta caccctcggc 5100 ctggagggcc cggccgtcac ggtcgacacc gcctgctcctcctcgctggt cgccctgcac 5160 ctcgccgtgc aggccctgcg caagggcgaa tgcggactggcgctcgcggg cggtgtgacg 5220 gtcatgtcga cgcccacgac gttcgtggag ttcagccggcagcgcgggct cgcggaggac 5280 ggccggtcga aggcgttcgc cgcgtcggcg gacggcttcggcccggcgga gggcgtcggc 5340 atgctcctcg tcgagcgcct gtcggacgcc cgccgcaacggacaccgtgt gctggcggtc 5400 gtgcgcggca gcgcggtcaa ccaggacggc gcgagcaacggcctgaccgc cccgaacggg 5460 ccctcgcagc agcgcgtcat ccggcgcgcg ctcgcggacgcccgactgac gaccgccgac 5520 gtggacgtcg tcgaggccca cggcacgggc acgcgactcggcgacccgat cgaggcacag 5580 gccctcatcg ccacctacgg ccaggggcgc gacaccgaacagccgctgcg cctggggtcg 5640 ttgaagtcca acatcggaca cacccaggcc gccgccggtgtctccggcat catcaagatg 5700 gtccaggcga tgcgccacgg cgtcctgccg aagacgctccacgtggaccg gccgtcggac 5760 cagatcgact ggtcggcggg cacggtcgag ctgctcaccgaggccatgga ctggccgagg 5820 aagcaggagg gcgggctgcg ccgcgcggcc gtctcctccttcggcatcag cggcacgaac 5880 gcgcacatcg tgctcgaaga agccccggtc gacgaggacgccccggcgga cgagccgtcg 5940 gtcggcggtg tggtgccgtg gctcgtgtcc gcgaagactccggccgcgct ggacgcccag 6000 atcggacgcc tcgccgcgtt cgcctcgcag ggccgtacggacgccgccga tccgggcgcg 6060 gtcgctcgcg tactggccgg cgggcgtgcg cagttcgagcaccgggccgt cgcgctcggc 6120 accggacagg acgacctggc 
ggccgcactg gccgcgcctgagggtctggt ccggggtgtg 6180 gcctccggtg tgggtcgagt ggcgttcgtg ttcccgggacagggcacgca gtgggccggg 6240 atgggtgccg aactcctcga cgtgtcgaag gagttcgcggcggccatggc cgagtgcgag 6300 gccgcgctcg ctccgtacgt ggactggtcg ctggaggccgtcgtccgaca ggcccccggc 6360 gcgcccacgc tggagcgggt cgatgtcgtc cagcccgtgacgttcgccgt catggtctcg 6420 ctggcgaagg tctggcagca ccacggggtg accccgcaagccgtcgtcgg ccactcgcag 6480 ggcgagatcg ccgccgcgta cgtcgccggt gccctgagcctggacgacgc cgctcgtgtc 6540 gtgaccctgc gcagcaagtc catcggcgcc cacctcgcgggccagggcgg catgctgtcc 6600 ctcgcgctga gcgaggcggc cgttgtggag cgactggccgggttcgacgg gctgtccgtc 6660 gccgccgtca acgggcctac cgccaccgtg gtttcgggcgacccgaccca gatccaagag 6720 ctcgctcagg cgtgtgaggc cgacggggtc cgcgcacggatcatccccgt cgactacgcc 6780 tcccacagcg cccacgtcga gaccatcgag agcgaactcgccgacgtcct ggcggggttg 6840 tccccccaga caccccaggt ccccttcttc tccaccctcgaaggcgcctg gatcaccgaa 6900 cccgccctcg acggcggcta ctggtaccgc aacctccgccatcgtgtggg cttcgccccg 6960 gccgtcgaaa ccctggccac cgacgaaggc ttcacccacttcgtcgaggt cagcgcccac 7020 cccgtcctca ccatggcgct gcccgagacc gtcaccggactcggcaccct ccgccgtgac 7080 aacggcggac agcaccgcct caccacctcc ctcgccgaggcctgggccaa cggcctcacc 7140 gtcgactggg cctctctcct ccccaccacg accacccaccccgatctgcc cacctacgcc 7200 ttccagaccg agcgctactg gccgcagccc gacctctccgccgccggtga catcacctcc 7260 gccggtctcg gggcggccga gcacccgctg ctcggcgcggccgtggcgct cgcggactcc 7320 gacggctgcc tgctcacggg gagcctctcc ctccgtacgcacccctggct ggcggaccac 7380 gcggtggccg gcaccgtgct gctgccggga acggcgttcgtggagctggc gttccgagcc 7440 ggggaccagg tcggttgcga tctggtcgag gagctcaccctcgacgcgcc gctcgtgctg 7500 ccccgtcgtg gcgcggtccg tgtgcagctg tccgtcggcgcgagcgacga gtccgggcgt 7560 cgtaccttcg ggctctacgc gcacccggag gacgcgccgggcgaggcgga gtggacgcgg 7620 cacgccaccg gtgtgctggc cgcccgtgcg gaccgcaccgcccccgtcgc cgacccggag 7680 gcctggccgc cgccgggcgc cgagccggtg gacgtggacggtctgtacga gcgcttcgcg 7740 gcgaacggct acggctacgg ccccctcttc cagggcgtccgtggtgtctg gcggcgtggc 7800 gacgaggtgt tcgccgacgt ggccctgccg gccgaggtcgccggtgccga 
gggcgcgcgg 7860 ttcggccttc acccggcgct gctcgacgcc gccgtgcaggcggccggtgc gggccggggc 7920 gttcggcgcg ggcacgcggc tgccgttcgc ctggagcgggatctcctgta cgcggtcggc 7980 gccaccgccc tccgcgtgcg gctggccccc gccggcccggacacggtgtc cgtgagcgcc 8040 gccgactcct ccgggcagcc ggtgttcgcc gcggactccctcacggtgct gcccgtcgac 8100 cccgcgcagc tggcggcctt cagcgacccg actctggacgcgctgcacct gctggagtgg 8160 accgcctggg acggtgccgc gcaggccctg cccggcgcggtcgtgctggg cggcgacgcc 8220 gacggtctcg ccgcggcgct gcgcgccggt ggcaccgaggtcctgtcctt cccggacctt 8280 acggacctgg tggaggccgt cgaccggggc gagaccccggccccggcgac cgtcctggtg 8340 gcctgccccg ccgccggccc cgatgggccg gagcatgtccgcgaggccct gcacgggtcg 8400 ctcgcgctga tgcaggcctg gctggccgac gagcggttcaccgatgggcg cctggtgctc 8460 gtgacccgcg acgcggtcgc cgcccgttcc ggcgacggcctgcggtccac gggacaggcc 8520 gccgtctggg gcctcggccg gtccgcgcag acggagagcccgggccggtt cgtcctgctc 8580 gacctcgccg gggaagcccg gacggccggg gacgccaccgccggggacgg cctgacgacc 8640 ggggacgcca ccgtcggcgg cacctctgga gacgccgccctcggcagcgc cctcgcgacc 8700 gccctcggct cgggcgagcc gcagctcgcc ctccgggacggggcgctcct cgtaccccgc 8760 ctggcgcggg ccgccgcgcc cgccgcggcc gacggcctcgccgcggccga cggcctcgcc 8820 gctctgccgc tgcccgccgc tccggccctc tggcgtctggagcccggtac ggacggcagc 8880 ctggagagcc tcacggcggc gcccggcgac gccgagaccctcgccccgga gccgctcggc 8940 ccgggacagg tccgcatcgc gatccgggcc accggtctcaacttccgcga cgtcctgatc 9000 gccctcggca tgtaccccga tccggcgctg atgggcaccgagggagccgg cgtggtcacc 9060 gcgaccggcc ccggcgtcac gcacctcgcc cccggcgaccgggtcatggg cctgctctcc 9120 ggcgcgtacg ccccggtcgt cgtggcggac gcgcggaccgtcgcgcggat gcccgagggg 9180 tggacgttcg cccagggcgc ctccgtgccg gtggtgttcctgacggccgt ctacgccctg 9240 cgcgacctgg cggacgtcaa gcccggcgag cgcctcctggtccactccgc cgccggtggc 9300 gtgggcatgg ccgccgtgca gctcgcccgg cactggggcgtggaggtcca cggcacggcg 9360 agtcacggga agtgggacgc cctgcgcgcg ctcggcctggacgacgcgca catcgcctcc 9420 tcccgcaccc tggacttcga gtccgcgttc cgtgccgcttccggcggggc gggcatggac 9480 gtcgtactga actcgctcgc ccgcgagttc gtcgacgcctcgctgcgcct gctcgggccg 9540 ggcggccggt tcgtggagat 
ggggaagacc gacgtccgcgacgcggagcg ggtcgccgcc 9600 gaccaccccg gtgtcggcta ccgcgccttc gacctgggcgaggccgggcc ggagcggatc 9660 ggcgagatgc tcgccgaggt catcgccctc ttcgaggacggggtgctccg gcacctgccc 9720 gtcacgacct gggacgtgcg ccgggcccgc gacgccttccggcacgtcag ccaggcccgc 9780 cacacgggca aggtcgtcct cacgatgccg tcgggcctcgacccggaggg tacggtcctg 9840 ctgaccggcg gcaccggtgc gctggggggc atcgtggcccggcacgtggt gggcgagtgg 9900 ggcgtacgac gcctgctgct cgtgagccgg cggggcacggacgccccggg cgccggcgag 9960 ctcgtgcacg agctggaggc cctgggagcc gacgtctcggtggccgcgtg cgacgtcgcc 10020 gaccgcgaag ccctcaccgc cgtactcgac tcgatccccgccgaacaccc gctcaccgcg 10080 gtcgtccaca cggcaggcgt cctctccgac ggcaccctcccctcgatgac agcggaggat 10140 gtggaacacg tactgcgtcc caaggtcgac gccgcgttcctcctcgacga actcacctcg 10200 acgcccggct acgacctggc agcgttcgtc atgttctcctccgccgccgc cgtcttcggt 10260 ggcgcggggc agggcgccta cgccgccgcc aacgccaccctcgacgccct cgcctggcgc 10320 cgccggacag ccggactccc cgccctctcc ctcggctggggcctctgggc cgagaccagc 10380 ggcatgaccg gcggactcag cgacaccgac cgctcgcggctggcccgttc cggggcgacg 10440 cccatggaca gcgagctgac cctgtccctc ctggacgcggccatgcgccg cgacgacccg 10500 gcgctcgtcc cgatcgccct ggacgtcgcc gcgctccgcgcccagcagcg cgacggcatg 10560 ctggcgccgc tgctcagcgg gctcacccgc ggatcgcgggtcggcggcgc gccggtcaac 10620 cagcgcaggg cagccgccgg aggcgcgggc gaggcggacacggacctcgg cgggcggctc 10680 gccgcgatga caccggacga ccgggtcgcg cacctgcgggacctcgtccg tacgcacgtg 10740 gcgaccgtcc tgggacacgg caccccgagc cgggtggacctggagcgggc cttccgcgac 10800 accggtttcg actcgctcac cgccgtcgaa ctccgcaaccgtctcaacgc cgcgaccggg 10860 ctgcggctgc cggccacgct ggtcttcgac caccccaccccgggggagct cgccgggcac 10920 ctgctcgacg aactcgccac ggccgcgggc gggtcctgggcggaaggcac cgggtccgga 10980 gacacggcct cggcgaccga tcggcagacc acggcggccctcgccgaact cgaccggctg 11040 gaaggcgtgc tcgcctccct cgcgcccgcc gccggcggccgtccggagct cgccgcccgg 11100 ctcagggcgc tggccgcggc cctgggggac gacggcgacgacgccaccga cctggacgag 11160 gcgtccgacg acgacctctt ctccttcatc gacaaggagctgggcgactc cgacttctga 11220 33 3739 PRT Streptomyces venezuelae 33 
MetSer Thr Val Asn Glu Glu Lys Tyr Leu Asp Tyr Leu Arg Arg Ala 1 5 10 15Thr Ala Asp Leu His Glu Ala Arg Gly Arg Leu Arg Glu Leu Glu Ala 20 25 30Lys Ala Gly Glu Pro Val Ala Ile Val Gly Met Ala Cys Arg Leu Pro 35 40 45Gly Gly Val Ala Ser Pro Glu Asp Leu Trp Arg Leu Val Ala Gly Gly 50 55 60Glu Asp Ala Ile Ser Glu Phe Pro Gln Asp Arg Gly Trp Asp Val Glu 65 70 7580 Gly Leu Tyr Asp Pro Asn Pro Glu Ala Thr Gly Lys Ser Tyr Ala Arg 85 9095 Glu Ala Gly Phe Leu Tyr Glu Ala Gly Glu Phe Asp Ala Asp Phe Phe 100105 110 Gly Ile Ser Pro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg Leu115 120 125 Leu Leu Glu Ala Ser Trp Glu Ala Phe Glu His Ala Gly Ile ProAla 130 135 140 Ala Thr Ala Arg Gly Thr Ser Val Gly Val Phe Thr Gly ValMet Tyr 145 150 155 160 His Asp Tyr Ala Thr Arg Leu Thr Asp Val Pro GluGly Ile Glu Gly 165 170 175 Tyr Leu Gly Thr Gly Asn Ser Gly Ser Val AlaSer Gly Arg Val Ala 180 185 190 Tyr Thr Leu Gly Leu Glu Gly Pro Ala ValThr Val Asp Thr Ala Cys 195 200 205 Ser Ser Ser Leu Val Ala Leu His LeuAla Val Gln Ala Leu Arg Lys 210 215 220 Gly Glu Val Asp Met Ala Leu AlaGly Gly Val Thr Val Met Ser Thr 225 230 235 240 Pro Ser Thr Phe Val GluPhe Ser Arg Gln Arg Gly Leu Ala Pro Asp 245 250 255 Gly Arg Ser Lys SerPhe Ser Ser Thr Ala Asp Gly Thr Ser Trp Ser 260 265 270 Glu Gly Val GlyVal Leu Leu Val Glu Arg Leu Ser Asp Ala Arg Arg 275 280 285 Lys Gly HisArg Ile Leu Ala Val Val Arg Gly Thr Ala Val Asn Gln 290 295 300 Asp GlyAla Ser Ser Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Gln 305 310 315 320Arg Val Ile Arg Arg Ala Leu Ala Asp Ala Arg Leu Thr Thr Ser Asp 325 330335 Val Asp Val Val Glu Ala His Gly Thr Gly Thr Arg Leu Gly Asp Pro 340345 350 Ile Glu Ala Gln Ala Val Ile Ala Thr Tyr Gly Gln Gly Arg Asp Gly355 360 365 Glu Gln Pro Leu Arg Leu Gly Ser Leu Lys Ser Asn Ile Gly HisThr 370 375 380 Gln Ala Ala Ala Gly Val Ser Gly Val Ile Lys Met Val GlnAla Met 385 390 395 400 Arg His Gly Val Leu Pro Lys Thr Leu His Val GluLys Pro Thr Asp 405 410 415 Gln Val Asp Trp Ser Ala Gly Ala Val Glu LeuLeu 
Thr Glu Ala Met 420 425 430 Asp Trp Pro Asp Lys Gly Asp Gly Gly LeuArg Arg Ala Ala Val Ser 435 440 445 Ser Phe Gly Val Ser Gly Thr Asn AlaHis Val Val Leu Glu Glu Ala 450 455 460 Pro Ala Ala Glu Glu Thr Pro AlaSer Glu Ala Thr Pro Ala Val Glu 465 470 475 480 Pro Ser Val Gly Ala GlyLeu Val Pro Trp Leu Val Ser Ala Lys Thr 485 490 495 Pro Ala Ala Leu AspAla Gln Ile Gly Arg Leu Ala Ala Phe Ala Ser 500 505 510 Gln Gly Arg ThrAsp Ala Ala Asp Pro Gly Ala Val Ala Arg Val Leu 515 520 525 Ala Gly GlyArg Ala Glu Phe Glu His Arg Ala Val Val Leu Gly Thr 530 535 540 Gly GlnAsp Asp Phe Ala Gln Ala Leu Thr Ala Pro Glu Gly Leu Ile 545 550 555 560Arg Gly Thr Pro Ser Asp Val Gly Arg Val Ala Phe Val Phe Pro Gly 565 570575 Gln Gly Thr Gln Trp Ala Gly Met Gly Ala Glu Leu Leu Asp Val Ser 580585 590 Lys Glu Phe Ala Ala Ala Met Ala Glu Cys Glu Ser Ala Leu Ser Arg595 600 605 Tyr Val Asp Trp Ser Leu Glu Ala Val Val Arg Gln Ala Pro GlyAla 610 615 620 Pro Thr Leu Glu Arg Val Asp Val Val Gln Pro Val Thr PheAla Val 625 630 635 640 Met Val Ser Leu Ala Lys Val Trp Gln His His GlyVal Thr Pro Gln 645 650 655 Ala Val Val Gly His Ser Gln Gly Glu Ile AlaAla Ala Tyr Val Ala 660 665 670 Gly Ala Leu Thr Leu Asp Asp Ala Ala ArgVal Val Thr Leu Arg Ser 675 680 685 Lys Ser Ile Ala Ala His Leu Ala GlyLys Gly Gly Met Ile Ser Leu 690 695 700 Ala Leu Ser Glu Glu Ala Thr ArgGln Arg Ile Glu Asn Leu His Gly 705 710 715 720 Leu Ser Ile Ala Ala ValAsn Gly Pro Thr Ala Thr Val Val Ser Gly 725 730 735 Asp Pro Thr Gln IleGln Glu Leu Ala Gln Ala Cys Glu Ala Asp Gly 740 745 750 Val Arg Ala ArgIle Ile Pro Val Asp Tyr Ala Ser His Ser Ala His 755 760 765 Val Glu ThrIle Glu Ser Glu Leu Ala Glu Val Leu Ala Gly Leu Ser 770 775 780 Pro ArgThr Pro Glu Val Pro Phe Phe Ser Thr Leu Glu Gly Ala Trp 785 790 795 800Ile Thr Glu Pro Val Leu Asp Gly Thr Tyr Trp Tyr Arg Asn Leu Arg 805 810815 His Arg Val Gly Phe Ala Pro Ala Val Glu Thr Leu Ala Thr Asp Glu 820825 830 Gly Phe Thr His Phe Ile Glu Val Ser Ala His Pro Val Leu Thr Met835 840 845 Thr 
Leu Pro Glu Thr Val Thr Gly Leu Gly Thr Leu Arg Arg GluGln 850 855 860 Gly Gly Gln Glu Arg Leu Val Thr Ser Leu Ala Glu Ala TrpThr Asn 865 870 875 880 Gly Leu Thr Ile Asp Trp Ala Pro Val Leu Pro ThrAla Thr Gly His 885 890 895 His Pro Glu Leu Pro Thr Tyr Ala Phe Gln ArgArg His Tyr Trp Leu 900 905 910 His Asp Ser Pro Ala Val Gln Gly Ser ValGln Asp Ser Trp Arg Tyr 915 920 925 Arg Ile Asp Trp Lys Arg Leu Ala ValAla Asp Ala Ser Glu Arg Ala 930 935 940 Gly Leu Ser Gly Arg Trp Leu ValVal Val Pro Glu Asp Arg Ser Ala 945 950 955 960 Glu Ala Ala Pro Val LeuAla Ala Leu Ser Gly Ala Gly Ala Asp Pro 965 970 975 Val Gln Leu Asp ValSer Pro Leu Gly Asp Arg Gln Arg Leu Ala Ala 980 985 990 Thr Leu Gly GluAla Leu Ala Ala Ala Gly Gly Ala Val Asp Gly Val 995 1000 1005 Leu SerLeu Leu Ala Trp Asp Glu Ser Ala His Pro Gly His Pro Ala 1010 1015 1020Pro Phe Thr Arg Gly Thr Gly Ala Thr Leu Thr Leu Val Gln Ala Leu 10251030 1035 1040 Glu Asp Ala Gly Val Ala Ala Pro Leu Trp Cys Val Thr HisGly Ala 1045 1050 1055 Val Ser Val Gly Arg Ala Asp His Val Thr Ser ProAla Gln Ala Met 1060 1065 1070 Val Trp Gly Met Gly Arg Val Ala Ala LeuGlu His Pro Glu Arg Trp 1075 1080 1085 Gly Gly Leu Ile Asp Leu Pro SerAsp Ala Asp Arg Ala Ala Leu Asp 1090 1095 1100 Arg Met Thr Thr Val LeuAla Gly Gly Thr Gly Glu Asp Gln Val Ala 1105 1110 1115 1120 Val Arg AlaSer Gly Leu Leu Ala Arg Arg Leu Val Arg Ala Ser Leu 1125 1130 1135 ProAla His Gly Thr Ala Ser Pro Trp Trp Gln Ala Asp Gly Thr Val 1140 11451150 Leu Val Thr Gly Ala Glu Glu Pro Ala Ala Ala Glu Ala Ala Arg Arg1155 1160 1165 Leu Ala Arg Asp Gly Ala Gly His Leu Leu Leu His Thr ThrPro Ser 1170 1175 1180 Gly Ser Glu Gly Ala Glu Gly Thr Ser Gly Ala AlaGlu Asp Ser Gly 1185 1190 1195 1200 Leu Ala Gly Leu Val Ala Glu Leu AlaAsp Leu Gly Ala Thr Ala Thr 1205 1210 1215 Val Val Thr Cys Asp Leu ThrAsp Ala Glu Ala Ala Ala Arg Leu Leu 1220 1225 1230 Ala Gly Val Ser AspAla His Pro Leu Ser Ala Val Leu His Leu Pro 1235 1240 1245 Pro Thr ValAsp Ser Glu Pro Leu Ala Ala Thr Asp Ala Asp Ala 
Leu 1250 1255 1260 AlaArg Val Val Thr Ala Lys Ala Thr Ala Ala Leu His Leu Asp Arg 1265 12701275 1280 Leu Leu Arg Glu Ala Ala Ala Ala Gly Gly Arg Pro Pro Val LeuVal 1285 1290 1295 Leu Phe Ser Ser Val Ala Ala Ile Trp Gly Gly Ala GlyGln Gly Ala 1300 1305 1310 Tyr Ala Ala Gly Thr Ala Phe Leu Asp Ala LeuAla Gly Gln His Arg 1315 1320 1325 Ala Asp Gly Pro Thr Val Thr Ser ValAla Trp Ser Pro Trp Glu Gly 1330 1335 1340 Ser Arg Val Thr Glu Gly AlaThr Gly Glu Arg Leu Arg Arg Leu Gly 1345 1350 1355 1360 Leu Arg Pro LeuAla Pro Ala Thr Ala Leu Thr Ala Leu Asp Thr Ala 1365 1370 1375 Leu GlyHis Gly Asp Thr Ala Val Thr Ile Ala Asp Val Asp Trp Ser 1380 1385 1390Ser Phe Ala Pro Gly Phe Thr Thr Ala Arg Pro Gly Thr Leu Leu Ala 13951400 1405 Asp Leu Pro Glu Ala Arg Arg Ala Leu Asp Glu Gln Gln Ser ThrThr 1410 1415 1420 Ala Ala Asp Asp Thr Val Leu Ser Arg Glu Leu Gly AlaLeu Thr Gly 1425 1430 1435 1440 Ala Glu Gln Gln Arg Arg Met Gln Glu LeuVal Arg Glu His Leu Ala 1445 1450 1455 Val Val Leu Asn His Pro Ser ProGlu Ala Val Asp Thr Gly Arg Ala 1460 1465 1470 Phe Arg Asp Leu Gly PheAsp Ser Leu Thr Ala Val Glu Leu Arg Asn 1475 1480 1485 Arg Leu Lys AsnAla Thr Gly Leu Ala Leu Pro Ala Thr Leu Val Phe 1490 1495 1500 Asp TyrPro Thr Pro Arg Thr Leu Ala Glu Phe Leu Leu Ala Glu Ile 1505 1510 15151520 Leu Gly Glu Gln Ala Gly Ala Gly Glu Gln Leu Pro Val Asp Gly Gly1525 1530 1535 Val Asp Asp Glu Pro Val Ala Ile Val Gly Met Ala Cys ArgLeu Pro 1540 1545 1550 Gly Gly Val Ala Ser Pro Glu Asp Leu Trp Arg LeuVal Ala Gly Gly 1555 1560 1565 Glu Asp Ala Ile Ser Gly Phe Pro Gln AspArg Gly Trp Asp Val Glu 1570 1575 1580 Gly Leu Tyr Asp Pro Asp Pro AspAla Ser Gly Arg Thr Tyr Cys Arg 1585 1590 1595 1600 Ala Gly Gly Phe LeuAsp Glu Ala Gly Glu Phe Asp Ala Asp Phe Phe 1605 1610 1615 Gly Ile SerPro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg Leu 1620 1625 1630 LeuLeu Glu Thr Ser Trp Glu Ala Val Glu Asp Ala Gly Ile Asp Pro 1635 16401645 Thr Ser Leu Gln Gly Gln Gln Val Gly Val Phe Ala Gly Thr Asn Gly1650 1655 1660 Pro His 
Tyr Glu Pro Leu Leu Arg Asn Thr Ala Glu Asp LeuGlu Gly 1665 1670 1675 1680 Tyr Val Gly Thr Gly Asn Ala Ala Ser Ile MetSer Gly Arg Val Ser 1685 1690 1695 Tyr Thr Leu Gly Leu Glu Gly Pro AlaVal Thr Val Asp Thr Ala Cys 1700 1705 1710 Ser Ser Ser Leu Val Ala LeuHis Leu Ala Val Gln Ala Leu Arg Lys 1715 1720 1725 Gly Glu Cys Gly LeuAla Leu Ala Gly Gly Val Thr Val Met Ser Thr 1730 1735 1740 Pro Thr ThrPhe Val Glu Phe Ser Arg Gln Arg Gly Leu Ala Glu Asp 1745 1750 1755 1760Gly Arg Ser Lys Ala Phe Ala Ala Ser Ala Asp Gly Phe Gly Pro Ala 17651770 1775 Glu Gly Val Gly Met Leu Leu Val Glu Arg Leu Ser Asp Ala ArgArg 1780 1785 1790 Asn Gly His Arg Val Leu Ala Val Val Arg Gly Ser AlaVal Asn Gln 1795 1800 1805 Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro AsnGly Pro Ser Gln Gln 1810 1815 1820 Arg Val Ile Arg Arg Ala Leu Ala AspAla Arg Leu Thr Thr Ala Asp 1825 1830 1835 1840 Val Asp Val Val Glu AlaHis Gly Thr Gly Thr Arg Leu Gly Asp Pro 1845 1850 1855 Ile Glu Ala GlnAla Leu Ile Ala Thr Tyr Gly Gln Gly Arg Asp Thr 1860 1865 1870 Glu GlnPro Leu Arg Leu Gly Ser Leu Lys Ser Asn Ile Gly His Thr 1875 1880 1885Gln Ala Ala Ala Gly Val Ser Gly Ile Ile Lys Met Val Gln Ala Met 18901895 1900 Arg His Gly Val Leu Pro Lys Thr Leu His Val Asp Arg Pro SerAsp 1905 1910 1915 1920 Gln Ile Asp Trp Ser Ala Gly Thr Val Glu Leu LeuThr Glu Ala Met 1925 1930 1935 Asp Trp Pro Arg Lys Gln Glu Gly Gly LeuArg Arg Ala Ala Val Ser 1940 1945 1950 Ser Phe Gly Ile Ser Gly Thr AsnAla His Ile Val Leu Glu Glu Ala 1955 1960 1965 Pro Val Asp Glu Asp AlaPro Ala Asp Glu Pro Ser Val Gly Gly Val 1970 1975 1980 Val Pro Trp LeuVal Ser Ala Lys Thr Pro Ala Ala Leu Asp Ala Gln 1985 1990 1995 2000 IleGly Arg Leu Ala Ala Phe Ala Ser Gln Gly Arg Thr Asp Ala Ala 2005 20102015 Asp Pro Gly Ala Val Ala Arg Val Leu Ala Gly Gly Arg Ala Gln Phe2020 2025 2030 Glu His Arg Ala Val Ala Leu Gly Thr Gly Gln Asp Asp LeuAla Ala 2035 2040 2045 Ala Leu Ala Ala Pro Glu Gly Leu Val Arg Gly ValAla Ser Gly Val 2050 2055 2060 Gly Arg Val Ala Phe Val Phe Pro Gly 
GlnGly Thr Gln Trp Ala Gly 2065 2070 2075 2080 Met Gly Ala Glu Leu Leu AspVal Ser Lys Glu Phe Ala Ala Ala Met 2085 2090 2095 Ala Glu Cys Glu AlaAla Leu Ala Pro Tyr Val Asp Trp Ser Leu Glu 2100 2105 2110 Ala Val ValArg Gln Ala Pro Gly Ala Pro Thr Leu Glu Arg Val Asp 2115 2120 2125 ValVal Gln Pro Val Thr Phe Ala Val Met Val Ser Leu Ala Lys Val 2130 21352140 Trp Gln His His Gly Val Thr Pro Gln Ala Val Val Gly His Ser Gln2145 2150 2155 2160 Gly Glu Ile Ala Ala Ala Tyr Val Ala Gly Ala Leu SerLeu Asp Asp 2165 2170 2175 Ala Ala Arg Val Val Thr Leu Arg Ser Lys SerIle Gly Ala His Leu 2180 2185 2190 Ala Gly Gln Gly Gly Met Leu Ser LeuAla Leu Ser Glu Ala Ala Val 2195 2200 2205 Val Glu Arg Leu Ala Gly PheAsp Gly Leu Ser Val Ala Ala Val Asn 2210 2215 2220 Gly Pro Thr Ala ThrVal Val Ser Gly Asp Pro Thr Gln Ile Gln Glu 2225 2230 2235 2240 Leu AlaGln Ala Cys Glu Ala Asp Gly Val Arg Ala Arg Ile Ile Pro 2245 2250 2255Val Asp Tyr Ala Ser His Ser Ala His Val Glu Thr Ile Glu Ser Glu 22602265 2270 Leu Ala Asp Val Leu Ala Gly Leu Ser Pro Gln Thr Pro Gln ValPro 2275 2280 2285 Phe Phe Ser Thr Leu Glu Gly Ala Trp Ile Thr Glu ProAla Leu Asp 2290 2295 2300 Gly Gly Tyr Trp Tyr Arg Asn Leu Arg His ArgVal Gly Phe Ala Pro 2305 2310 2315 2320 Ala Val Glu Thr Leu Ala Thr AspGlu Gly Phe Thr His Phe Val Glu 2325 2330 2335 Val Ser Ala His Pro ValLeu Thr Met Ala Leu Pro Glu Thr Val Thr 2340 2345 2350 Gly Leu Gly ThrLeu Arg Arg Asp Asn Gly Gly Gln His Arg Leu Thr 2355 2360 2365 Thr SerLeu Ala Glu Ala Trp Ala Asn Gly Leu Thr Val Asp Trp Ala 2370 2375 2380Ser Leu Leu Pro Thr Thr Thr Thr His Pro Asp Leu Pro Thr Tyr Ala 23852390 2395 2400 Phe Gln Thr Glu Arg Tyr Trp Pro Gln Pro Asp Leu Ser AlaAla Gly 2405 2410 2415 Asp Ile Thr Ser Ala Gly Leu Gly Ala Ala Glu HisPro Leu Leu Gly 2420 2425 2430 Ala Ala Val Ala Leu Ala Asp Ser Asp GlyCys Leu Leu Thr Gly Ser 2435 2440 2445 Leu Ser Leu Arg Thr His Pro TrpLeu Ala Asp His Ala Val Ala Gly 2450 2455 2460 Thr Val Leu Leu Pro GlyThr Ala Phe Val Glu Leu Ala Phe Arg Ala 
2465 2470 2475 2480 Gly Asp GlnVal Gly Cys Asp Leu Val Glu Glu Leu Thr Leu Asp Ala 2485 2490 2495 ProLeu Val Leu Pro Arg Arg Gly Ala Val Arg Val Gln Leu Ser Val 2500 25052510 Gly Ala Ser Asp Glu Ser Gly Arg Arg Thr Phe Gly Leu Tyr Ala His2515 2520 2525 Pro Glu Asp Ala Pro Gly Glu Ala Glu Trp Thr Arg His AlaThr Gly 2530 2535 2540 Val Leu Ala Ala Arg Ala Asp Arg Thr Ala Pro ValAla Asp Pro Glu 2545 2550 2555 2560 Ala Trp Pro Pro Pro Gly Ala Glu ProVal Asp Val Asp Gly Leu Tyr 2565 2570 2575 Glu Arg Phe Ala Ala Asn GlyTyr Gly Tyr Gly Pro Leu Phe Gln Gly 2580 2585 2590 Val Arg Gly Val TrpArg Arg Gly Asp Glu Val Phe Ala Asp Val Ala 2595 2600 2605 Leu Pro AlaGlu Val Ala Gly Ala Glu Gly Ala Arg Phe Gly Leu His 2610 2615 2620 ProAla Leu Leu Asp Ala Ala Val Gln Ala Ala Gly Ala Gly Arg Gly 2625 26302635 2640 Val Arg Arg Gly His Ala Ala Ala Val Arg Leu Glu Arg Asp LeuLeu 2645 2650 2655 Tyr Ala Val Gly Ala Thr Ala Leu Arg Val Arg Leu AlaPro Ala Gly 2660 2665 2670 Pro Asp Thr Val Ser Val Ser Ala Ala Asp SerSer Gly Gln Pro Val 2675 2680 2685 Phe Ala Ala Asp Ser Leu Thr Val LeuPro Val Asp Pro Ala Gln Leu 2690 2695 2700 Ala Ala Phe Ser Asp Pro ThrLeu Asp Ala Leu His Leu Leu Glu Trp 2705 2710 2715 2720 Thr Ala Trp AspGly Ala Ala Gln Ala Leu Pro Gly Ala Val Val Leu 2725 2730 2735 Gly GlyAsp Ala Asp Gly Leu Ala Ala Ala Leu Arg Ala Gly Gly Thr 2740 2745 2750Glu Val Leu Ser Phe Pro Asp Leu Thr Asp Leu Val Glu Ala Val Asp 27552760 2765 Arg Gly Glu Thr Pro Ala Pro Ala Thr Val Leu Val Ala Cys ProAla 2770 2775 2780 Ala Gly Pro Asp Gly Pro Glu His Val Arg Glu Ala LeuHis Gly Ser 2785 2790 2795 2800 Leu Ala Leu Met Gln Ala Trp Leu Ala AspGlu Arg Phe Thr Asp Gly 2805 2810 2815 Arg Leu Val Leu Val Thr Arg AspAla Val Ala Ala Arg Ser Gly Asp 2820 2825 2830 Gly Leu Arg Ser Thr GlyGln Ala Ala Val Trp Gly Leu Gly Arg Ser 2835 2840 2845 Ala Gln Thr GluSer Pro Gly Arg Phe Val Leu Leu Asp Leu Ala Gly 2850 2855 2860 Glu AlaArg Thr Ala Gly Asp Ala Thr Ala Gly Asp Gly Leu Thr Thr 2865 2870 28752880 Gly Asp 
Ala Thr Val Gly Gly Thr Ser Gly Asp Ala Ala Leu Gly Ser2885 2890 2895 Ala Leu Ala Thr Ala Leu Gly Ser Gly Glu Pro Gln Leu AlaLeu Arg 2900 2905 2910 Asp Gly Ala Leu Leu Val Pro Arg Leu Ala Arg AlaAla Ala Pro Ala 2915 2920 2925 Ala Ala Asp Gly Leu Ala Ala Ala Asp GlyLeu Ala Ala Leu Pro Leu 2930 2935 2940 Pro Ala Ala Pro Ala Leu Trp ArgLeu Glu Pro Gly Thr Asp Gly Ser 2945 2950 2955 2960 Leu Glu Ser Leu ThrAla Ala Pro Gly Asp Ala Glu Thr Leu Ala Pro 2965 2970 2975 Glu Pro LeuGly Pro Gly Gln Val Arg Ile Ala Ile Arg Ala Thr Gly 2980 2985 2990 LeuAsn Phe Arg Asp Val Leu Ile Ala Leu Gly Met Tyr Pro Asp Pro 2995 30003005 Ala Leu Met Gly Thr Glu Gly Ala Gly Val Val Thr Ala Thr Gly Pro3010 3015 3020 Gly Val Thr His Leu Ala Pro Gly Asp Arg Val Met Gly LeuLeu Ser 3025 3030 3035 3040 Gly Ala Tyr Ala Pro Val Val Val Ala Asp AlaArg Thr Val Ala Arg 3045 3050 3055 Met Pro Glu Gly Trp Thr Phe Ala GlnGly Ala Ser Val Pro Val Val 3060 3065 3070 Phe Leu Thr Ala Val Tyr AlaLeu Arg Asp Leu Ala Asp Val Lys Pro 3075 3080 3085 Gly Glu Arg Leu LeuVal His Ser Ala Ala Gly Gly Val Gly Met Ala 3090 3095 3100 Ala Val GlnLeu Ala Arg His Trp Gly Val Glu Val His Gly Thr Ala 3105 3110 3115 3120Ser His Gly Lys Trp Asp Ala Leu Arg Ala Leu Gly Leu Asp Asp Ala 31253130 3135 His Ile Ala Ser Ser Arg Thr Leu Asp Phe Glu Ser Ala Phe ArgAla 3140 3145 3150 Ala Ser Gly Gly Ala Gly Met Asp Val Val Leu Asn SerLeu Ala Arg 3155 3160 3165 Glu Phe Val Asp Ala Ser Leu Arg Leu Leu GlyPro Gly Gly Arg Phe 3170 3175 3180 Val Glu Met Gly Lys Thr Asp Val ArgAsp Ala Glu Arg Val Ala Ala 3185 3190 3195 3200 Asp His Pro Gly Val GlyTyr Arg Ala Phe Asp Leu Gly Glu Ala Gly 3205 3210 3215 Pro Glu Arg IleGly Glu Met Leu Ala Glu Val Ile Ala Leu Phe Glu 3220 3225 3230 Asp GlyVal Leu Arg His Leu Pro Val Thr Thr Trp Asp Val Arg Arg 3235 3240 3245Ala Arg Asp Ala Phe Arg His Val Ser Gln Ala Arg His Thr Gly Lys 32503255 3260 Val Val Leu Thr Met Pro Ser Gly Leu Asp Pro Glu Gly Thr ValLeu 3265 3270 3275 3280 Leu Thr Gly Gly Thr Gly Ala Leu Gly 
Gly Ile ValAla Arg His Val 3285 3290 3295 Val Gly Glu Trp Gly Val Arg Arg Leu LeuLeu Val Ser Arg Arg Gly 3300 3305 3310 Thr Asp Ala Pro Gly Ala Gly GluLeu Val His Glu Leu Glu Ala Leu 3315 3320 3325 Gly Ala Asp Val Ser ValAla Ala Cys Asp Val Ala Asp Arg Glu Ala 3330 3335 3340 Leu Thr Ala ValLeu Asp Ser Ile Pro Ala Glu His Pro Leu Thr Ala 3345 3350 3355 3360 ValVal His Thr Ala Gly Val Leu Ser Asp Gly Thr Leu Pro Ser Met 3365 33703375 Thr Ala Glu Asp Val Glu His Val Leu Arg Pro Lys Val Asp Ala Ala3380 3385 3390 Phe Leu Leu Asp Glu Leu Thr Ser Thr Pro Gly Tyr Asp LeuAla Ala 3395 3400 3405 Phe Val Met Phe Ser Ser Ala Ala Ala Val Phe GlyGly Ala Gly Gln 3410 3415 3420 Gly Ala Tyr Ala Ala Ala Asn Ala Thr LeuAsp Ala Leu Ala Trp Arg 3425 3430 3435 3440 Arg Arg Thr Ala Gly Leu ProAla Leu Ser Leu Gly Trp Gly Leu Trp 3445 3450 3455 Ala Glu Thr Ser GlyMet Thr Gly Gly Leu Ser Asp Thr Asp Arg Ser 3460 3465 3470 Arg Leu AlaArg Ser Gly Ala Thr Pro Met Asp Ser Glu Leu Thr Leu 3475 3480 3485 SerLeu Leu Asp Ala Ala Met Arg Arg Asp Asp Pro Ala Leu Val Pro 3490 34953500 Ile Ala Leu Asp Val Ala Ala Leu Arg Ala Gln Gln Arg Asp Gly Met3505 3510 3515 3520 Leu Ala Pro Leu Leu Ser Gly Leu Thr Arg Gly Ser ArgVal Gly Gly 3525 3530 3535 Ala Pro Val Asn Gln Arg Arg Ala Ala Ala GlyGly Ala Gly Glu Ala 3540 3545 3550 Asp Thr Asp Leu Gly Gly Arg Leu AlaAla Met Thr Pro Asp Asp Arg 3555 3560 3565 Val Ala His Leu Arg Asp LeuVal Arg Thr His Val Ala Thr Val Leu 3570 3575 3580 Gly His Gly Thr ProSer Arg Val Asp Leu Glu Arg Ala Phe Arg Asp 3585 3590 3595 3600 Thr GlyPhe Asp Ser Leu Thr Ala Val Glu Leu Arg Asn Arg Leu Asn 3605 3610 3615Ala Ala Thr Gly Leu Arg Leu Pro Ala Thr Leu Val Phe Asp His Pro 36203625 3630 Thr Pro Gly Glu Leu Ala Gly His Leu Leu Asp Glu Leu Ala ThrAla 3635 3640 3645 Ala Gly Gly Ser Trp Ala Glu Gly Thr Gly Ser Gly AspThr Ala Ser 3650 3655 3660 Ala Thr Asp Arg Gln Thr Thr Ala Ala Leu AlaGlu Leu Asp Arg Leu 3665 3670 3675 3680 Glu Gly Val Leu Ala Ser Leu AlaPro Ala Ala Gly Gly Arg Pro Glu 
3685 3690 3695 Leu Ala Ala Arg Leu ArgAla Leu Ala Ala Ala Leu Gly Asp Asp Gly 3700 3705 3710 Asp Asp Ala ThrAsp Leu Asp Glu Ala Ser Asp Asp Asp Leu Phe Ser 3715 3720 3725 Phe IleAsp Lys Glu Leu Gly Asp Ser Asp Phe 3730 3735 34 4689 DNA Streptomycesvenezuelae 34 atggcgaaca acgaagacaa gctccgcgac tacctcaagc gcgtcaccgccgagctgcag 60 cagaacacca ggcgtctgcg cgagatcgag ggacgcacgc acgagccggtggcgatcgtg 120 ggcatggcct gccgcctgcc gggcggtgtc gcctcgcccg aggacctgtggcagctggtg 180 gccggggacg gggacgcgat ctcggagttc ccgcaggacc gcggctgggacgtggagggg 240 ctgtacgacc ccgacccgga cgcgtccggc aggacgtact gccggtccggcggattcctg 300 cacgacgccg gcgagttcga cgccgacttc ttcgggatct cgccgcgcgaggccctcgcc 360 atggacccgc agcagcgact gtccctcacc accgcgtggg aggcgatcgagagcgcgggc 420 atcgacccga cggccctgaa gggcagcggc ctcggcgtct tcgtcggcggctggcacacc 480 ggctacacct cggggcagac caccgccgtg cagtcgcccg agctggagggccacctggtc 540 agcggcgcgg cgctgggctt cctgtccggc cgtatcgcgt acgtcctcggtacggacgga 600 ccggccctga ccgtggacac ggcctgctcg tcctcgctgg tcgccctgcacctcgccgtg 660 caggccctcc gcaagggcga gtgcgacatg gccctcgccg gtggtgtcacggtcatgccc 720 aacgcggacc tgttcgtgca gttcagccgg cagcgcgggc tggccgcggacggccggtcg 780 aaggcgttcg ccacctcggc ggacggcttc ggccccgcgg agggcgccggagtcctgctg 840 gtggagcgcc tgtcggacgc ccgccgcaac ggacaccgga tcctcgcggtcgtccgcggc 900 agcgcggtca accaggacgg cgccagcaac ggcctcacgg ctccgcacgggccctcccag 960 cagcgcgtca tccgacgggc cctggcggac gcccggctcg cgccgggtgacgtggacgtc 1020 gtcgaggcgc acggcacggg cacgcggctc ggcgacccga tcgaggcgcaggccctcatc 1080 gccacctacg gccaggagaa gagcagcgaa cagccgctga ggctgggcgcgttgaagtcg 1140 aacatcgggc acacgcaggc cgcggccggt gtcgcaggtg tcatcaagatggtccaggcg 1200 atgcgccacg gactgctgcc gaagacgctg cacgtcgacg agccctcggaccagatcgac 1260 tggtcggcgg gcacggtgga actcctcacc gaggccgtcg actggccggagaagcaggac 1320 ggcgggctgc gccgcgcggc tgtctcctcc ttcggcatca gcgggacgaacgcgcacgtc 1380 gtcctggagg aggccccggc ggtcgaggac tccccggccg tcgagccgccggccggtggc 1440 ggtgtggtgc cgtggccggt gtccgcgaag actccggccg cgctggacgcccagatcggg 1500 
cagctcgccg cgtacgcgga cggtcgtacg gacgtggatc cggcggtggccgcccgcgcc 1560 ctggtcgaca gccgtacggc gatggagcac cgcgcggtcg cggtcggcgacagccgggag 1620 gcactgcggg acgccctgcg gatgccggaa ggactggtac gcggcacgtcctcggacgtg 1680 ggccgggtgg cgttcgtctt ccccggccag ggcacgcagt gggccggcatgggcgccgaa 1740 ctccttgaca gctcaccgga gttcgctgcc tcgatggccg aatgcgagaccgcgctctcc 1800 cgctacgtcg actggtctct tgaagccgtc gtccgacagg aacccggcgcacccacgctc 1860 gaccgcgtcg acgtcgtcca gcccgtgacc ttcgctgtca tggtctcgctggcgaaggtc 1920 tggcagcacc acggcatcac cccccaggcc gtcgtcggcc actcgcagggcgagatcgcc 1980 gccgcgtacg tcgccggtgc actcaccctc gacgacgccg cccgcgtcgtcaccctgcgc 2040 agcaagtcca tcgccgccca cctcgccggc aagggcggca tgatctccctcgccctcgac 2100 gaggcggccg tcctgaagcg actgagcgac ttcgacggac tctccgtcgccgccgtcaac 2160 ggccccaccg ccaccgtcgt ctccggcgac ccgacccaga tcgaggaactcgcccgcacc 2220 tgcgaggccg acggcgtccg tgcgcggatc atcccggtcg actacgcctcccacagccgg 2280 caggtcgaga tcatcgagaa ggagctggcc gaggtcctcg ccggactcgccccgcaggct 2340 ccgcacgtgc cgttcttctc caccctcgaa ggcacctgga tcaccgagccggtgctcgac 2400 ggcacctact ggtaccgcaa cctgcgccat cgcgtgggct tcgcccccgccgtggagacc 2460 ttggcggttg acggcttcac ccacttcatc gaggtcagcg cccaccccgtcctcaccatg 2520 accctccccg agaccgtcac cggcctcggc accctccgcc gcgaacagggaggccaggag 2580 cgtctggtca cctcactcgc cgaagcctgg gccaacggcc tcaccatcgactgggcgccc 2640 atcctcccca ccgcaaccgg ccaccacccc gagctcccca cctacgccttccagaccgag 2700 cgcttctggc tgcagagctc cgcgcccacc agcgccgccg acgactggcgttaccgcgtc 2760 gagtggaagc cgctgacggc ctccggccag gcggacctgt ccgggcggtggatcgtcgcc 2820 gtcgggagcg agccagaagc cgagctgctg ggcgcgctga aggccgcgggagcggaggtc 2880 gacgtactgg aagccggggc ggacgacgac cgtgaggccc tcgccgcccggctcaccgca 2940 ctgacgaccg gcgacggctt caccggcgtg gtctcgctcc tcgacgacctcgtgccacag 3000 gtcgcctggg tgcaggcact cggcgacgcc ggaatcaagg cgcccctgtggtccgtcacc 3060 cagggcgcgg tctccgtcgg acgtctcgac acccccgccg accccgaccgggccatgctc 3120 tggggcctcg gccgcgtcgt cgcccttgag caccccgaac gctgggccggcctcgtcgac 3180 ctccccgccc agcccgatgc cgccgccctc 
gcccacctcg tcaccgcactctccggcgcc 3240 accggcgagg accagatcgc catccgcacc accggactcc acgcccgccgcctcgcccgc 3300 gcacccctcc acggacgtcg gcccacccgc gactggcagc cccacggcaccgtcctcatc 3360 accggcggca ccggagccct cggcagccac gccgcacgct ggatggcccaccacggagcc 3420 gaacacctcc tcctcgtcag ccgcagcggc gaacaagccc ccggagccacccaactcacc 3480 gccgaactca ccgcatcggg cgcccgcgtc accatcgccg cctgcgacgtcgccgacccc 3540 cacgccatgc gcaccctcct cgacgccatc cccgccgaga cgcccctcaccgccgtcgtc 3600 cacaccgccg gcgcaccggg cggcgatccg ctggacgtca ccggcccggaggacatcgcc 3660 cgcatcctgg gcgcgaagac gagcggcgcc gaggtcctcg acgacctgctccgcggcact 3720 ccgctggacg ccttcgtcct ctactcctcg aacgccgggg tctggggcagcggcagccag 3780 ggcgtctacg cggcggccaa cgcccacctc gacgcgctcg ccgcccggcgccgcgcccgg 3840 ggcgagacgg cgacctcggt cgcctggggc ctctgggccg gcgacggcatgggccggggc 3900 gccgacgacg cgtactggca gcgtcgcggc atccgtccga tgagccccgaccgcgccctg 3960 gacgaactgg ccaaggccct gagccacgac gagaccttcg tcgccgtggccgatgtcgac 4020 tgggagcggt tcgcgcccgc gttcacggtg tcccgtccca gccttctgctcgacggcgtc 4080 ccggaggccc ggcaggcgct cgccgcaccc gtcggtgccc cggctcccggcgacgccgcc 4140 gtggcgccga ccgggcagtc gtcggcgctg gccgcgatca ccgcgctccccgagcccgag 4200 cgccggccgg cgctcctcac cctcgtccgt acccacgcgg cggccgtactcggccattcc 4260 tcccccgacc gggtggcccc cggccgtgcc ttcaccgagc tcggcttcgactcgctgacg 4320 gccgtgcagc tccgcaacca gctctccacg gtggtcggca acaggctccccgccaccacg 4380 gtcttcgacc acccgacgcc cgccgcactc gccgcgcacc tccacgaggcgtacctcgca 4440 ccggccgagc cggccccgac ggactgggag gggcgggtgc gccgggccctggccgaactg 4500 cccctcgacc ggctgcggga cgcgggggtc ctcgacaccg tcctgcgcctcaccggcatc 4560 gagcccgagc cgggttccgg cggttcggac ggcggcgccg ccgaccctggtgcggagccg 4620 gaggcgtcga tcgacgacct ggacgccgag gccctgatcc ggatggctctcggcccccgt 4680 aacacctga 4689 35 1562 PRT Streptomyces venezuelae 35Met Ala Asn Asn Glu Asp Lys Leu Arg Asp Tyr Leu Lys Arg Val Thr 1 5 1015 Ala Glu Leu Gln Gln Asn Thr Arg Arg Leu Arg Glu Ile Glu Gly Arg 20 2530 Thr His Glu Pro Val Ala Ile Val Gly Met Ala Cys Arg Leu Pro Gly 35 4045 Gly Val Ala 
Ser Pro Glu Asp Leu Trp Gln Leu Val Ala Gly Asp Gly 50 5560 Asp Ala Ile Ser Glu Phe Pro Gln Asp Arg Gly Trp Asp Val Glu Gly 65 7075 80 Leu Tyr Asp Pro Asp Pro Asp Ala Ser Gly Arg Thr Tyr Cys Arg Ser 8590 95 Gly Gly Phe Leu His Asp Ala Gly Glu Phe Asp Ala Asp Phe Phe Gly100 105 110 Ile Ser Pro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg LeuSer 115 120 125 Leu Thr Thr Ala Trp Glu Ala Ile Glu Ser Ala Gly Ile AspPro Thr 130 135 140 Ala Leu Lys Gly Ser Gly Leu Gly Val Phe Val Gly GlyTrp His Thr 145 150 155 160 Gly Tyr Thr Ser Gly Gln Thr Thr Ala Val GlnSer Pro Glu Leu Glu 165 170 175 Gly His Leu Val Ser Gly Ala Ala Leu GlyPhe Leu Ser Gly Arg Ile 180 185 190 Ala Tyr Val Leu Gly Thr Asp Gly ProAla Leu Thr Val Asp Thr Ala 195 200 205 Cys Ser Ser Ser Leu Val Ala LeuHis Leu Ala Val Gln Ala Leu Arg 210 215 220 Lys Gly Glu Cys Asp Met AlaLeu Ala Gly Gly Val Thr Val Met Pro 225 230 235 240 Asn Ala Asp Leu PheVal Gln Phe Ser Arg Gln Arg Gly Leu Ala Ala 245 250 255 Asp Gly Arg SerLys Ala Phe Ala Thr Ser Ala Asp Gly Phe Gly Pro 260 265 270 Ala Glu GlyAla Gly Val Leu Leu Val Glu Arg Leu Ser Asp Ala Arg 275 280 285 Arg AsnGly His Arg Ile Leu Ala Val Val Arg Gly Ser Ala Val Asn 290 295 300 GlnAsp Gly Ala Ser Asn Gly Leu Thr Ala Pro His Gly Pro Ser Gln 305 310 315320 Gln Arg Val Ile Arg Arg Ala Leu Ala Asp Ala Arg Leu Ala Pro Gly 325330 335 Asp Val Asp Val Val Glu Ala His Gly Thr Gly Thr Arg Leu Gly Asp340 345 350 Pro Ile Glu Ala Gln Ala Leu Ile Ala Thr Tyr Gly Gln Glu LysSer 355 360 365 Ser Glu Gln Pro Leu Arg Leu Gly Ala Leu Lys Ser Asn IleGly His 370 375 380 Thr Gln Ala Ala Ala Gly Val Ala Gly Val Ile Lys MetVal Gln Ala 385 390 395 400 Met Arg His Gly Leu Leu Pro Lys Thr Leu HisVal Asp Glu Pro Ser 405 410 415 Asp Gln Ile Asp Trp Ser Ala Gly Thr ValGlu Leu Leu Thr Glu Ala 420 425 430 Val Asp Trp Pro Glu Lys Gln Asp GlyGly Leu Arg Arg Ala Ala Val 435 440 445 Ser Ser Phe Gly Ile Ser Gly ThrAsn Ala His Val Val Leu Glu Glu 450 455 460 Ala Pro Ala Val Glu Asp SerPro Ala Val Glu Pro Pro 
Ala Gly Gly 465 470 475 480 Gly Val Val Pro TrpPro Val Ser Ala Lys Thr Pro Ala Ala Leu Asp 485 490 495 Ala Gln Ile GlyGln Leu Ala Ala Tyr Ala Asp Gly Arg Thr Asp Val 500 505 510 Asp Pro AlaVal Ala Ala Arg Ala Leu Val Asp Ser Arg Thr Ala Met 515 520 525 Glu HisArg Ala Val Ala Val Gly Asp Ser Arg Glu Ala Leu Arg Asp 530 535 540 AlaLeu Arg Met Pro Glu Gly Leu Val Arg Gly Thr Ser Ser Asp Val 545 550 555560 Gly Arg Val Ala Phe Val Phe Pro Gly Gln Gly Thr Gln Trp Ala Gly 565570 575 Met Gly Ala Glu Leu Leu Asp Ser Ser Pro Glu Phe Ala Ala Ser Met580 585 590 Ala Glu Cys Glu Thr Ala Leu Ser Arg Tyr Val Asp Trp Ser LeuGlu 595 600 605 Ala Val Val Arg Gln Glu Pro Gly Ala Pro Thr Leu Asp ArgVal Asp 610 615 620 Val Val Gln Pro Val Thr Phe Ala Val Met Val Ser LeuAla Lys Val 625 630 635 640 Trp Gln His His Gly Ile Thr Pro Gln Ala ValVal Gly His Ser Gln 645 650 655 Gly Glu Ile Ala Ala Ala Tyr Val Ala GlyAla Leu Thr Leu Asp Asp 660 665 670 Ala Ala Arg Val Val Thr Leu Arg SerLys Ser Ile Ala Ala His Leu 675 680 685 Ala Gly Lys Gly Gly Met Ile SerLeu Ala Leu Asp Glu Ala Ala Val 690 695 700 Leu Lys Arg Leu Ser Asp PheAsp Gly Leu Ser Val Ala Ala Val Asn 705 710 715 720 Gly Pro Thr Ala ThrVal Val Ser Gly Asp Pro Thr Gln Ile Glu Glu 725 730 735 Leu Ala Arg ThrCys Glu Ala Asp Gly Val Arg Ala Arg Ile Ile Pro 740 745 750 Val Asp TyrAla Ser His Ser Arg Gln Val Glu Ile Ile Glu Lys Glu 755 760 765 Leu AlaGlu Val Leu Ala Gly Leu Ala Pro Gln Ala Pro His Val Pro 770 775 780 PhePhe Ser Thr Leu Glu Gly Thr Trp Ile Thr Glu Pro Val Leu Asp 785 790 795800 Gly Thr Tyr Trp Tyr Arg Asn Leu Arg His Arg Val Gly Phe Ala Pro 805810 815 Ala Val Glu Thr Leu Ala Val Asp Gly Phe Thr His Phe Ile Glu Val820 825 830 Ser Ala His Pro Val Leu Thr Met Thr Leu Pro Glu Thr Val ThrGly 835 840 845 Leu Gly Thr Leu Arg Arg Glu Gln Gly Gly Gln Glu Arg LeuVal Thr 850 855 860 Ser Leu Ala Glu Ala Trp Ala Asn Gly Leu Thr Ile AspTrp Ala Pro 865 870 875 880 Ile Leu Pro Thr Ala Thr Gly His His Pro GluLeu Pro Thr Tyr Ala 885 890 895 Phe 
Gln Thr Glu Arg Phe Trp Leu Gln SerSer Ala Pro Thr Ser Ala 900 905 910 Ala Asp Asp Trp Arg Tyr Arg Val GluTrp Lys Pro Leu Thr Ala Ser 915 920 925 Gly Gln Ala Asp Leu Ser Gly ArgTrp Ile Val Ala Val Gly Ser Glu 930 935 940 Pro Glu Ala Glu Leu Leu GlyAla Leu Lys Ala Ala Gly Ala Glu Val 945 950 955 960 Asp Val Leu Glu AlaGly Ala Asp Asp Asp Arg Glu Ala Leu Ala Ala 965 970 975 Arg Leu Thr AlaLeu Thr Thr Gly Asp Gly Phe Thr Gly Val Val Ser 980 985 990 Leu Leu AspAsp Leu Val Pro Gln Val Ala Trp Val Gln Ala Leu Gly 995 1000 1005 AspAla Gly Ile Lys Ala Pro Leu Trp Ser Val Thr Gln Gly Ala Val 1010 10151020 Ser Val Gly Arg Leu Asp Thr Pro Ala Asp Pro Asp Arg Ala Met Leu1025 1030 1035 1040 Trp Gly Leu Gly Arg Val Val Ala Leu Glu His Pro GluArg Trp Ala 1045 1050 1055 Gly Leu Val Asp Leu Pro Ala Gln Pro Asp AlaAla Ala Leu Ala His 1060 1065 1070 Leu Val Thr Ala Leu Ser Gly Ala ThrGly Glu Asp Gln Ile Ala Ile 1075 1080 1085 Arg Thr Thr Gly Leu His AlaArg Arg Leu Ala Arg Ala Pro Leu His 1090 1095 1100 Gly Arg Arg Pro ThrArg Asp Trp Gln Pro His Gly Thr Val Leu Ile 1105 1110 1115 1120 Thr GlyGly Thr Gly Ala Leu Gly Ser His Ala Ala Arg Trp Met Ala 1125 1130 1135His His Gly Ala Glu His Leu Leu Leu Val Ser Arg Ser Gly Glu Gln 11401145 1150 Ala Pro Gly Ala Thr Gln Leu Thr Ala Glu Leu Thr Ala Ser GlyAla 1155 1160 1165 Arg Val Thr Ile Ala Ala Cys Asp Val Ala Asp Pro HisAla Met Arg 1170 1175 1180 Thr Leu Leu Asp Ala Ile Pro Ala Glu Thr ProLeu Thr Ala Val Val 1185 1190 1195 1200 His Thr Ala Gly Ala Pro Gly GlyAsp Pro Leu Asp Val Thr Gly Pro 1205 1210 1215 Glu Asp Ile Ala Arg IleLeu Gly Ala Lys Thr Ser Gly Ala Glu Val 1220 1225 1230 Leu Asp Asp LeuLeu Arg Gly Thr Pro Leu Asp Ala Phe Val Leu Tyr 1235 1240 1245 Ser SerAsn Ala Gly Val Trp Gly Ser Gly Ser Gln Gly Val Tyr Ala 1250 1255 1260Ala Ala Asn Ala His Leu Asp Ala Leu Ala Ala Arg Arg Arg Ala Arg 12651270 1275 1280 Gly Glu Thr Ala Thr Ser Val Ala Trp Gly Leu Trp Ala GlyAsp Gly 1285 1290 1295 Met Gly Arg Gly Ala Asp Asp Ala Tyr Trp Gln ArgArg 
Gly Ile Arg 1300 1305 1310 Pro Met Ser Pro Asp Arg Ala Leu Asp GluLeu Ala Lys Ala Leu Ser 1315 1320 1325 His Asp Glu Thr Phe Val Ala ValAla Asp Val Asp Trp Glu Arg Phe 1330 1335 1340 Ala Pro Ala Phe Thr ValSer Arg Pro Ser Leu Leu Leu Asp Gly Val 1345 1350 1355 1360 Pro Glu AlaArg Gln Ala Leu Ala Ala Pro Val Gly Ala Pro Ala Pro 1365 1370 1375 GlyAsp Ala Ala Val Ala Pro Thr Gly Gln Ser Ser Ala Leu Ala Ala 1380 13851390 Ile Thr Ala Leu Pro Glu Pro Glu Arg Arg Pro Ala Leu Leu Thr Leu1395 1400 1405 Val Arg Thr His Ala Ala Ala Val Leu Gly His Ser Ser ProAsp Arg 1410 1415 1420 Val Ala Pro Gly Arg Ala Phe Thr Glu Leu Gly PheAsp Ser Leu Thr 1425 1430 1435 1440 Ala Val Gln Leu Arg Asn Gln Leu SerThr Val Val Gly Asn Arg Leu 1445 1450 1455 Pro Ala Thr Thr Val Phe AspHis Pro Thr Pro Ala Ala Leu Ala Ala 1460 1465 1470 His Leu His Glu AlaTyr Leu Ala Pro Ala Glu Pro Ala Pro Thr Asp 1475 1480 1485 Trp Glu GlyArg Val Arg Arg Ala Leu Ala Glu Leu Pro Leu Asp Arg 1490 1495 1500 LeuArg Asp Ala Gly Val Leu Asp Thr Val Leu Arg Leu Thr Gly Ile 1505 15101515 1520 Glu Pro Glu Pro Gly Ser Gly Gly Ser Asp Gly Gly Ala Ala AspPro 1525 1530 1535 Gly Ala Glu Pro Glu Ala Ser Ile Asp Asp Leu Asp AlaGlu Ala Leu 1540 1545 1550 Ile Arg Met Ala Leu Gly Pro Arg Asn Thr 15551560 36 4041 DNA Streptomyces venezuelae 36 atgacgagtt ccaacgaacagttggtggac gctctgcgcg cctctctcaa ggagaacgaa 60 gaactccgga aagagagccgtcgccgggcc gaccgtcggc aggagcccat ggcgatcgtc 120 ggcatgagct gccggttcgcgggcggaatc cggtcccccg aggacctctg ggacgccgtc 180 gccgcgggca aggacctggtctccgaggta ccggaggagc gcggctggga catcgactcc 240 ctctacgacc cggtgcccgggcgcaagggc acgacgtacg tccgcaacgc cgcgttcctc 300 gacgacgccg ccggattcgacgcggccttc ttcgggatct cgccgcgcga ggccctcgcc 360 atggacccgc agcagcggcagctcctcgaa gcctcctggg aggtcttcga gcgggccggc 420 atcgaccccg cgtcggtccgcggcaccgac gtcggcgtgt acgtgggctg tggctaccag 480 gactacgcgc cggacatccgggtcgccccc gaaggcaccg gcggttacgt cgtcaccggc 540 aactcctccg ccgtggcctccgggcgcatc gcgtactccc tcggcctgga gggacccgcc 600 gtgaccgtgg 
acacggcgtgctcctcttcg ctcgtcgccc tgcacctcgc cctgaagggc 660 ctgcggaacg gcgactgctcgacggcactc gtgggcggcg tggccgtcct cgcgacgccg 720 ggcgcgttca tcgagttcagcagccagcag gccatggccg ccgacggccg gaccaagggc 780 ttcgcctcgg cggcggacggcctcgcctgg ggcgagggcg tcgccgtact cctcctcgaa 840 cggctctccg acgcgcggcgcaagggccac cgggtcctgg ccgtcgtgcg cggcagcgcc 900 atcaaccagg acggcgcgagcaacggcctc acggctccgc acgggccctc ccagcagcac 960 ctgatccgcc aggccctggccgacgcgcgg ctcacgtcga gcgacgtgga cgtcgtggag 1020 ggccacggca cggggacccgtctcggcgac ccgatcgagg cgcaggcgct gctcgccacg 1080 tacgggcagg ggcgcgccccggggcagccg ctgcggctgg ggacgctgaa gtcgaacatc 1140 gggcacacgc aggccgcttcgggtgtcgcc ggtgtcatca agatggtgca ggcgctgcgc 1200 cacggggtgc tgccgaagaccctgcacgtg gacgagccga cggaccaggt cgactggtcg 1260 gccggttcgg tcgagctgctcaccgaggcc gtggactggc cggagcggcc gggccggctc 1320 cgccgggcgg gcgtctccgcgttcggcgtg ggcgggacga acgcgcacgt cgtcctggag 1380 gaggccccgg cggtcgaggagtcccctgcc gtcgagccgc cggccggtgg cggcgtggtg 1440 ccgtggccgg tgtccgcgaagacctcggcc gcactggacg cccagatcgg gcagctcgcc 1500 gcatacgcgg aagaccgcacggacgtggat ccggcggtgg ccgcccgcgc cctggtcgac 1560 agccgtacgg cgatggagcaccgcgcggtc gcggtcggcg acagccggga ggcactgcgg 1620 gacgccctgc ggatgccggaaggactggta cggggcacgg tcaccgatcc gggccgggtg 1680 gcgttcgtct tccccggccagggcacgcag tgggccggca tgggcgccga actcctcgac 1740 agctcacccg aattcgccgccgccatggcc gaatgcgaga ccgcactctc cccgtacgtc 1800 gactggtctc tcgaagccgtcgtccgacag gctcccagcg caccgacact cgaccgcgtc 1860 gacgtcgtcc agcccgtcaccttcgccgtc atggtctccc tcgccaaggt ctggcagcac 1920 cacggcatca cccccgaggccgtcatcggc cactcccagg gcgagatcgc cgccgcgtac 1980 gtcgccggtg ccctcaccctcgacgacgcc gctcgtgtcg tgaccctccg cagcaagtcc 2040 atcgccgccc acctcgccggcaagggcggc atgatctccc tcgccctcag cgaggaagcc 2100 acccggcagc gcatcgagaacctccacgga ctgtcgatcg ccgccgtcaa cgggcctacc 2160 gccaccgtgg tttcgggcgaccccacccag atccaagaac ttgctcaggc gtgtgaggcc 2220 gacggcatcc gcgcacggatcatccccgtc gactacgcct cccacagcgc ccacgtcgag 2280 accatcgaga acgaactcgccgacgtcctg gcggggttgt ccccccagac 
accccaggtc 2340 cccttcttct ccaccctcgaaggcacctgg atcaccgaac ccgccctcga cggcggctac 2400 tggtaccgca acctccgccatcgtgtgggc ttcgccccgg ccgtcgagac cctcgccacc 2460 gacgaaggct tcacccacttcatcgaggtc agcgcccacc ccgtcctcac catgaccctc 2520 cccgacaagg tcaccggcctggccaccctc cgacgcgagg acggcggaca gcaccgcctc 2580 accacctccc ttgccgaggcctgggccaac ggcctcgccc tcgactgggc ctccctcctg 2640 cccgccacgg gcgccctcagccccgccgtc cccgacctcc cgacgtacgc cttccagcac 2700 cgctcgtact ggatcagccccgcgggtccc ggcgaggcgc ccgcgcacac cgcttccggg 2760 cgcgaggccg tcgccgagacggggctcgcg tggggcccgg gtgccgagga cctcgacgag 2820 gagggccggc gcagcgccgtactcgcgatg gtgatgcggc aggcggcctc cgtgctccgg 2880 tgcgactcgc ccgaagaggtccccgtcgac cgcccgctgc gggagatcgg cttcgactcg 2940 ctgaccgccg tcgacttccgcaaccgcgtc aaccggctga ccggtctcca gctgccgccc 3000 accgtcgtgt tccagcacccgacgcccgtc gcgctcgccg agcgcatcag cgacgagctg 3060 gccgagcgga actgggccgtcgccgagccg tcggatcacg agcaggcgga ggaggagaag 3120 gccgccgctc cggcgggggcccgctccggg gccgacaccg gcgccggcgc cgggatgttc 3180 cgcgccctgt tccggcaggccgtggaggac gaccggtacg gcgagttcct cgacgtcctc 3240 gccgaagcct ccgcgttccgcccgcagttc gcctcgcccg aggcctgctc ggagcggctc 3300 gacccggtgc tgctcgccggcggtccgacg gaccgggcgg aaggccgtgc cgttctcgtc 3360 ggctgcaccg gcaccgcggcgaacggcggc ccgcacgagt tcctgcggct cagcacctcc 3420 ttccaggagg agcgggacttcctcgccgta cctctccccg gctacggcac gggtacgggc 3480 accggcacgg ccctcctcccggccgatctc gacaccgcgc tcgacgccca ggcccgggcg 3540 atcctccggg ccgccggggacgccccggtc gtcctgctcg ggcactccgg cggcgccctg 3600 ctcgcgcacg agctggccttccgcctggag cgggcgcacg gcgcgccgcc ggccgggatc 3660 gtcctggtcg acccctatccgccgggccat caggagccca tcgaggtgtg gagcaggcag 3720 ctgggcgagg gcctgttcgcgggcgagctg gagccgatgt ccgatgcgcg gctgctggcc 3780 atgggccggt acgcgcggttcctcgccggc ccgcggccgg gccgcagcag cgcgcccgtg 3840 cttctggtcc gtgcctccgaaccgctgggc gactggcagg aggagcgggg cgactggcgt 3900 gcccactggg accttccgcacaccgtcgcg gacgtgccgg gcgaccactt cacgatgatg 3960 cgggaccacg cgccggccgtcgccgaggcc gtcctctcct ggctcgacgc catcgagggc 4020 atcgaggggg cgggcaagtg 
a4041 37 1346 PRT Streptomyces venezuelae 37 Met Thr Ser Ser Asn Glu GlnLeu Val Asp Ala Leu Arg Ala Ser Leu 1 5 10 15 Lys Glu Asn Glu Glu LeuArg Lys Glu Ser Arg Arg Arg Ala Asp Arg 20 25 30 Arg Gln Glu Pro Met AlaIle Val Gly Met Ser Cys Arg Phe Ala Gly 35 40 45 Gly Ile Arg Ser Pro GluAsp Leu Trp Asp Ala Val Ala Ala Gly Lys 50 55 60 Asp Leu Val Ser Glu ValPro Glu Glu Arg Gly Trp Asp Ile Asp Ser 65 70 75 80 Leu Tyr Asp Pro ValPro Gly Arg Lys Gly Thr Thr Tyr Val Arg Asn 85 90 95 Ala Ala Phe Leu AspAsp Ala Ala Gly Phe Asp Ala Ala Phe Phe Gly 100 105 110 Ile Ser Pro ArgGlu Ala Leu Ala Met Asp Pro Gln Gln Arg Gln Leu 115 120 125 Leu Glu AlaSer Trp Glu Val Phe Glu Arg Ala Gly Ile Asp Pro Ala 130 135 140 Ser ValArg Gly Thr Asp Val Gly Val Tyr Val Gly Cys Gly Tyr Gln 145 150 155 160Asp Tyr Ala Pro Asp Ile Arg Val Ala Pro Glu Gly Thr Gly Gly Tyr 165 170175 Val Val Thr Gly Asn Ser Ser Ala Val Ala Ser Gly Arg Ile Ala Tyr 180185 190 Ser Leu Gly Leu Glu Gly Pro Ala Val Thr Val Asp Thr Ala Cys Ser195 200 205 Ser Ser Leu Val Ala Leu His Leu Ala Leu Lys Gly Leu Arg AsnGly 210 215 220 Asp Cys Ser Thr Ala Leu Val Gly Gly Val Ala Val Leu AlaThr Pro 225 230 235 240 Gly Ala Phe Ile Glu Phe Ser Ser Gln Gln Ala MetAla Ala Asp Gly 245 250 255 Arg Thr Lys Gly Phe Ala Ser Ala Ala Asp GlyLeu Ala Trp Gly Glu 260 265 270 Gly Val Ala Val Leu Leu Leu Glu Arg LeuSer Asp Ala Arg Arg Lys 275 280 285 Gly His Arg Val Leu Ala Val Val ArgGly Ser Ala Ile Asn Gln Asp 290 295 300 Gly Ala Ser Asn Gly Leu Thr AlaPro His Gly Pro Ser Gln Gln His 305 310 315 320 Leu Ile Arg Gln Ala LeuAla Asp Ala Arg Leu Thr Ser Ser Asp Val 325 330 335 Asp Val Val Glu GlyHis Gly Thr Gly Thr Arg Leu Gly Asp Pro Ile 340 345 350 Glu Ala Gln AlaLeu Leu Ala Thr Tyr Gly Gln Gly Arg Ala Pro Gly 355 360 365 Gln Pro LeuArg Leu Gly Thr Leu Lys Ser Asn Ile Gly His Thr Gln 370 375 380 Ala AlaSer Gly Val Ala Gly Val Ile Lys Met Val Gln Ala Leu Arg 385 390 395 400His Gly Val Leu Pro Lys Thr Leu His Val Asp Glu Pro Thr Asp Gln 405 410415 Val 
Asp Trp Ser Ala Gly Ser Val Glu Leu Leu Thr Glu Ala Val Asp 420425 430 Trp Pro Glu Arg Pro Gly Arg Leu Arg Arg Ala Gly Val Ser Ala Phe435 440 445 Gly Val Gly Gly Thr Asn Ala His Val Val Leu Glu Glu Ala ProAla 450 455 460 Val Glu Glu Ser Pro Ala Val Glu Pro Pro Ala Gly Gly GlyVal Val 465 470 475 480 Pro Trp Pro Val Ser Ala Lys Thr Ser Ala Ala LeuAsp Ala Gln Ile 485 490 495 Gly Gln Leu Ala Ala Tyr Ala Glu Asp Arg ThrAsp Val Asp Pro Ala 500 505 510 Val Ala Ala Arg Ala Leu Val Asp Ser ArgThr Ala Met Glu His Arg 515 520 525 Ala Val Ala Val Gly Asp Ser Arg GluAla Leu Arg Asp Ala Leu Arg 530 535 540 Met Pro Glu Gly Leu Val Arg GlyThr Val Thr Asp Pro Gly Arg Val 545 550 555 560 Ala Phe Val Phe Pro GlyGln Gly Thr Gln Trp Ala Gly Met Gly Ala 565 570 575 Glu Leu Leu Asp SerSer Pro Glu Phe Ala Ala Ala Met Ala Glu Cys 580 585 590 Glu Thr Ala LeuSer Pro Tyr Val Asp Trp Ser Leu Glu Ala Val Val 595 600 605 Arg Gln AlaPro Ser Ala Pro Thr Leu Asp Arg Val Asp Val Val Gln 610 615 620 Pro ValThr Phe Ala Val Met Val Ser Leu Ala Lys Val Trp Gln His 625 630 635 640His Gly Ile Thr Pro Glu Ala Val Ile Gly His Ser Gln Gly Glu Ile 645 650655 Ala Ala Ala Tyr Val Ala Gly Ala Leu Thr Leu Asp Asp Ala Ala Arg 660665 670 Val Val Thr Leu Arg Ser Lys Ser Ile Ala Ala His Leu Ala Gly Lys675 680 685 Gly Gly Met Ile Ser Leu Ala Leu Ser Glu Glu Ala Thr Arg GlnArg 690 695 700 Ile Glu Asn Leu His Gly Leu Ser Ile Ala Ala Val Asn GlyPro Thr 705 710 715 720 Ala Thr Val Val Ser Gly Asp Pro Thr Gln Ile GlnGlu Leu Ala Gln 725 730 735 Ala Cys Glu Ala Asp Gly Ile Arg Ala Arg IleIle Pro Val Asp Tyr 740 745 750 Ala Ser His Ser Ala His Val Glu Thr IleGlu Asn Glu Leu Ala Asp 755 760 765 Val Leu Ala Gly Leu Ser Pro Gln ThrPro Gln Val Pro Phe Phe Ser 770 775 780 Thr Leu Glu Gly Thr Trp Ile ThrGlu Pro Ala Leu Asp Gly Gly Tyr 785 790 795 800 Trp Tyr Arg Asn Leu ArgHis Arg Val Gly Phe Ala Pro Ala Val Glu 805 810 815 Thr Leu Ala Thr AspGlu Gly Phe Thr His Phe Ile Glu Val Ser Ala 820 825 830 His Pro Val LeuThr Met Thr Leu Pro 
Asp Lys Val Thr Gly Leu Ala 835 840 845 Thr Leu ArgArg Glu Asp Gly Gly Gln His Arg Leu Thr Thr Ser Leu 850 855 860 Ala GluAla Trp Ala Asn Gly Leu Ala Leu Asp Trp Ala Ser Leu Leu 865 870 875 880Pro Ala Thr Gly Ala Leu Ser Pro Ala Val Pro Asp Leu Pro Thr Tyr 885 890895 Ala Phe Gln His Arg Ser Tyr Trp Ile Ser Pro Ala Gly Pro Gly Glu 900905 910 Ala Pro Ala His Thr Ala Ser Gly Arg Glu Ala Val Ala Glu Thr Gly915 920 925 Leu Ala Trp Gly Pro Gly Ala Glu Asp Leu Asp Glu Glu Gly ArgArg 930 935 940 Ser Ala Val Leu Ala Met Val Met Arg Gln Ala Ala Ser ValLeu Arg 945 950 955 960 Cys Asp Ser Pro Glu Glu Val Pro Val Asp Arg ProLeu Arg Glu Ile 965 970 975 Gly Phe Asp Ser Leu Thr Ala Val Asp Phe ArgAsn Arg Val Asn Arg 980 985 990 Leu Thr Gly Leu Gln Leu Pro Pro Thr ValVal Phe Gln His Pro Thr 995 1000 1005 Pro Val Ala Leu Ala Glu Arg IleSer Asp Glu Leu Ala Glu Arg Asn 1010 1015 1020 Trp Ala Val Ala Glu ProSer Asp His Glu Gln Ala Glu Glu Glu Lys 1025 1030 1035 1040 Ala Ala AlaPro Ala Gly Ala Arg Ser Gly Ala Asp Thr Gly Ala Gly 1045 1050 1055 AlaGly Met Phe Arg Ala Leu Phe Arg Gln Ala Val Glu Asp Asp Arg 1060 10651070 Tyr Gly Glu Phe Leu Asp Val Leu Ala Glu Ala Ser Ala Phe Arg Pro1075 1080 1085 Gln Phe Ala Ser Pro Glu Ala Cys Ser Glu Arg Leu Asp ProVal Leu 1090 1095 1100 Leu Ala Gly Gly Pro Thr Asp Arg Ala Glu Gly ArgAla Val Leu Val 1105 1110 1115 1120 Gly Cys Thr Gly Thr Ala Ala Asn GlyGly Pro His Glu Phe Leu Arg 1125 1130 1135 Leu Ser Thr Ser Phe Gln GluGlu Arg Asp Phe Leu Ala Val Pro Leu 1140 1145 1150 Pro Gly Tyr Gly ThrGly Thr Gly Thr Gly Thr Ala Leu Leu Pro Ala 1155 1160 1165 Asp Leu AspThr Ala Leu Asp Ala Gln Ala Arg Ala Ile Leu Arg Ala 1170 1175 1180 AlaGly Asp Ala Pro Val Val Leu Leu Gly His Ser Gly Gly Ala Leu 1185 11901195 1200 Leu Ala His Glu Leu Ala Phe Arg Leu Glu Arg Ala His Gly AlaPro 1205 1210 1215 Pro Ala Gly Ile Val Leu Val Asp Pro Tyr Pro Pro GlyHis Gln Glu 1220 1225 1230 Pro Ile Glu Val Trp Ser Arg Gln Leu Gly GluGly Leu Phe Ala Gly 1235 1240 1245 Glu Leu Glu Pro 
Met Ser Asp Ala ArgLeu Leu Ala Met Gly Arg Tyr 1250 1255 1260 Ala Arg Phe Leu Ala Gly ProArg Pro Gly Arg Ser Ser Ala Pro Val 1265 1270 1275 1280 Leu Leu Val ArgAla Ser Glu Pro Leu Gly Asp Trp Gln Glu Glu Arg 1285 1290 1295 Gly AspTrp Arg Ala His Trp Asp Leu Pro His Thr Val Ala Asp Val 1300 1305 1310Pro Gly Asp His Phe Thr Met Met Arg Asp His Ala Pro Ala Val Ala 13151320 1325 Glu Ala Val Leu Ser Trp Leu Asp Ala Ile Glu Gly Ile Glu GlyAla 1330 1335 1340 Gly Lys 1345 38 1251 DNA Streptomyces venezuelae 38gtgcgccgta cccagcaggg aacgaccgct tctcccccgg tactcgacct cggggccctg 60gggcaggatt tcgcggccga tccgtatccg acgtacgcga gactgcgtgc cgagggtccg 120gcccaccggg tgcgcacccc cgagggggac gaggtgtggc tggtcgtcgg ctacgaccgg 180gcgcgggcgg tcctcgccga tccccggttc agcaaggact ggcgcaactc cacgactccc 240ctgaccgagg ccgaggccgc gctcaaccac aacatgctgg agtccgaccc gccgcggcac 300acccggctgc gcaagctggt ggcccgtgag ttcaccatgc gccgggtcga gttgctgcgg 360ccccgggtcc aggagatcgt cgacgggctc gtggacgcca tgctggcggc gcccgacggc 420cgcgccgatc tgatggagtc cctggcctgg ccgctgccga tcaccgtgat ctccgaactc 480ctcggcgtgc ccgagccgga ccgcgccgcc ttccgcgtct ggaccgacgc cttcgtcttc 540ccggacgatc ccgcccaggc ccagaccgcc atggccgaga tgagcggcta tctctcccgg 600ctcatcgact ccaagcgcgg gcaggacggc gaggacctgc tcagcgcgct cgtgcggacc 660agcgacgagg acggctcccg gctgacctcc gaggagctgc tcggtatggc ccacatcctg 720ctcgtcgcgg ggcacgagac cacggtcaat ctgatcgcca acggcatgta cgcgctgctc 780tcgcaccccg accagctggc cgccctgcgg gccgacatga cgctcttgga cggcgcggtg 840gaggagatgt tgcgctacga gggcccggtg gaatccgcga cctaccgctt cccggtcgag 900cccgtcgacc tggacggcac ggtcatcccg gccggtgaca cggtcctcgt cgtcctggcc 960gacgcccacc gcacccccga gcgcttcccg gacccgcacc gcttcgacat ccgccgggac 1020accgccggcc atctcgcctt cggccacggc atccacttct gcatcggcgc ccccttggcc 1080cggttggagg cccggatcgc cgtccgcgcc cttctcgaac gctgcccgga cctcgccctg 1140gacgtctccc ccggcgaact cgtgtggtat ccgaacccga tgattcgcgg gctcaaggcc 1200ctgccgatcc gctggcggcg aggacgggag gcgggccgcc gtaccggttg a 1251 39 416 PRTStreptomyces venezuelae 39 Met Arg 
Arg Thr Gln Gln Gly Thr Thr Ala SerPro Pro Val Leu Asp 1 5 10 15 Leu Gly Ala Leu Gly Gln Asp Phe Ala AlaAsp Pro Tyr Pro Thr Tyr 20 25 30 Ala Arg Leu Arg Ala Glu Gly Pro Ala HisArg Val Arg Thr Pro Glu 35 40 45 Gly Asp Glu Val Trp Leu Val Val Gly TyrAsp Arg Ala Arg Ala Val 50 55 60 Leu Ala Asp Pro Arg Phe Ser Lys Asp TrpArg Asn Ser Thr Thr Pro 65 70 75 80 Leu Thr Glu Ala Glu Ala Ala Leu AsnHis Asn Met Leu Glu Ser Asp 85 90 95 Pro Pro Arg His Thr Arg Leu Arg LysLeu Val Ala Arg Glu Phe Thr 100 105 110 Met Arg Arg Val Glu Leu Leu ArgPro Arg Val Gln Glu Ile Val Asp 115 120 125 Gly Leu Val Asp Ala Met LeuAla Ala Pro Asp Gly Arg Ala Asp Leu 130 135 140 Met Glu Ser Leu Ala TrpPro Leu Pro Ile Thr Val Ile Ser Glu Leu 145 150 155 160 Leu Gly Val ProGlu Pro Asp Arg Ala Ala Phe Arg Val Trp Thr Asp 165 170 175 Ala Phe ValPhe Pro Asp Asp Pro Ala Gln Ala Gln Thr Ala Met Ala 180 185 190 Glu MetSer Gly Tyr Leu Ser Arg Leu Ile Asp Ser Lys Arg Gly Gln 195 200 205 AspGly Glu Asp Leu Leu Ser Ala Leu Val Arg Thr Ser Asp Glu Asp 210 215 220Gly Ser Arg Leu Thr Ser Glu Glu Leu Leu Gly Met Ala His Ile Leu 225 230235 240 Leu Val Ala Gly His Glu Thr Thr Val Asn Leu Ile Ala Asn Gly Met245 250 255 Tyr Ala Leu Leu Ser His Pro Asp Gln Leu Ala Ala Leu Arg AlaAsp 260 265 270 Met Thr Leu Leu Asp Gly Ala Val Glu Glu Met Leu Arg TyrGlu Gly 275 280 285 Pro Val Glu Ser Ala Thr Tyr Arg Phe Pro Val Glu ProVal Asp Leu 290 295 300 Asp Gly Thr Val Ile Pro Ala Gly Asp Thr Val LeuVal Val Leu Ala 305 310 315 320 Asp Ala His Arg Thr Pro Glu Arg Phe ProAsp Pro His Arg Phe Asp 325 330 335 Ile Arg Arg Asp Thr Ala Gly His LeuAla Phe Gly His Gly Ile His 340 345 350 Phe Cys Ile Gly Ala Pro Leu AlaArg Leu Glu Ala Arg Ile Ala Val 355 360 365 Arg Ala Leu Leu Glu Arg CysPro Asp Leu Ala Leu Asp Val Ser Pro 370 375 380 Gly Glu Leu Val Trp TyrPro Asn Pro Met Ile Arg Gly Leu Lys Ala 385 390 395 400 Leu Pro Ile ArgTrp Arg Arg Gly Arg Glu Ala Gly Arg Arg Thr Gly 405 410 415 40 2787 DNAStreptomyces venezuelae 40 atgaatctgg 
tggaacgcga cggggagata gcccatctcagggccgttct tgacgcatcc 60 gccgcaggtg acgggacgct cttactcgtc tccggaccggccggcagcgg gaagacggag 120 ctgctgcggt cgctccgccg gctggccgcc gagcgggagacccccgtctg gtcggtccgg 180 gcgctgccgg gtgaccgcga catccccctg ggcgtcctctgccagttact ccgcagcgcc 240 gaacaacacg gtgccgacac ctccgccgtc cgcgacctgctggacgccgc ctcgcggcgg 300 gccggaaacc tcacctcccc cgccgacgcg ccgctccgcgtcgacgagac acaccgcctg 360 cacgactggc tgctctccgt ctcccgccgc accccgttcctcgtcgccgt cgacgacctg 420 acccacgccg acaccgcgtc cctgaggttc ctcctgtactgcgccgccca ccacgaccag 480 ggcggcatcg gcttcgtcat gaccgagcgg gcctcgcagcgcgccggata ccgggtgttc 540 cgcgccgagc tgctccgcca gccgcactgc cgcaacatgtggctctccgg gcttcccccc 600 agcggggtac gccagttact cgcccactac tacggccccgaggccgccga gcggcgggcc 660 cccgcgtacc acgcgacgac cggcgggaac ccgctgctcctgcgggcgct gacccaggac 720 cggcaggcct cccacaccac cctcggcgcg gccggcggcgacgagcccgt ccacggcgac 780 gccttcgccc aggccgtcct cgactgcctg caccgcagcgccgagggcac actggagacc 840 gcccgctggc tcgcggtcct cgaacagtcc gacccgctcctggtggagcg gctcacggga 900 acgaccgccg ccgccgtcga gcgccacatc caggagctcgccgccatcgg cctcctggac 960 gaggacggca ccctgggaca gcccgcgatc cgcgaggccgccctccagga cctgccggcc 1020 ggcgagcgca ccgaactgca ccggcgcgcc gcggagcagctgcaccggga cggcgccgac 1080 gaggacaccg tggcccgcca cctgctggtc ggcggcgcccccgacgctcc ctgggcgctg 1140 cccctgctcg aacggggcgc gcagcaggcc ctgttcgacgaccgactcga cgacgccttc 1200 cggatcctcg agttcgccgt gcggtcgagc accgacaacacccagctggc ccgcctcgcc 1260 ccacacctgg tcgcggcctc ctggcggatg aacccgcacatgacgacccg ggccctcgca 1320 ctcttcgacc ggctcctgag cggtgaactg ccgcccagccacccggtcat ggccctgatc 1380 cgctgcctcg tctggtacgg gcggctgccc gaggccgccgacgcgctgtc ccggctgcgg 1440 cccagctccg acaacgatgc cttggagctg tcgctcacccggatgtggct cgcggcgctg 1500 tgcccgccgc tcctggagtc cctgccggcc acgccggagccggagcgggg tcccgtcccc 1560 gtacggctcg cgccgcggac gaccgcgctc caggcccaggccggcgtctt ccagcggggc 1620 ccggacaacg cctcggtcgc gcaggccgaa cagatcctgcagggctgccg gctgtcggag 1680 gagacgtacg aggccctgga gacggccctc ttggtcctcgtccacgccga ccggctcgac 
1740 cgggcgctgt tctggtcgga cgccctgctc gccgaggccgtggagcggcg gtcgctcggc 1800 tgggaggcgg tcttcgccgc gacccgggcg atgatcgcgatccgctgcgg cgacctcccg 1860 acggcgcggg agcgggccga gctggcgctc tcccacgcggcgccggagag ctggggcctc 1920 gccgtgggca tgcccctctc cgcgctgctg ctcgcctgcacggaggccgg cgagtacgaa 1980 caggcggagc gggtcctgcg gcagccggtg ccggacgcgatgttcgactc gcggcacggc 2040 atggagtaca tgcacgcccg gggccgctac tggctggcgacgggccggct gcacgcggcg 2100 ctgggcgagt tcatgctctg cggggagatc ctgggcagctggaacctcga ccagccctcg 2160 atcgtgccct ggcggacctc cgccgccgag gtgtacctgcggctcggcaa ccgccagaag 2220 gccagggcgc tggccgaggc ccagctcgcc ctggtgcggcccgggcgctc ccgcacccgg 2280 ggtctcaccc tgcgggtcct ggcggcggcg gtggacggccagcaggcgga gcggctgcac 2340 gccgaggcgg tcgacatgct gcacgacagc ggcgaccggctcgaacacgc ccgcgcgctc 2400 gccgggatga gccgccacca gcaggcccag ggggacaactaccgggcgag gatgacggcg 2460 cggctcgccg gcgacatggc gtgggcctgc ggcgcgtacccgctggccga ggagatcgtg 2520 ccgggccgcg gcggccgccg ggcgaaggcg gtgagcacggagctggaact gccgggcggc 2580 ccggacgtcg gcctgctctc ggaggccgaa cgccgggtggcggccctggc agcccgagga 2640 ttgacgaacc gccagatagc gcgccggctc tgcgtcaccgcgagcacggt cgaacagcac 2700 ctgacgcgcg tctaccgcaa actgaacgtg acccgccgagcagacctccc gatcagcctc 2760 gcccaggaca agtccgtcac ggcctga 2787 41 928 PRTStreptomyces venezuelae 41 Met Asn Leu Val Glu Arg Asp Gly Glu Ile AlaHis Leu Arg Ala Val 1 5 10 15 Leu Asp Ala Ser Ala Ala Gly Asp Gly ThrLeu Leu Leu Val Ser Gly 20 25 30 Pro Ala Gly Ser Gly Lys Thr Glu Leu LeuArg Ser Leu Arg Arg Leu 35 40 45 Ala Ala Glu Arg Glu Thr Pro Val Trp SerVal Arg Ala Leu Pro Gly 50 55 60 Asp Arg Asp Ile Pro Leu Gly Val Leu CysGln Leu Leu Arg Ser Ala 65 70 75 80 Glu Gln His Gly Ala Asp Thr Ser AlaVal Arg Asp Leu Leu Asp Ala 85 90 95 Ala Ser Arg Arg Ala Gly Asn Leu ThrSer Pro Ala Asp Ala Pro Leu 100 105 110 Arg Val Asp Glu Thr His Arg LeuHis Asp Trp Leu Leu Ser Val Ser 115 120 125 Arg Arg Thr Pro Phe Leu ValAla Val Asp Asp Leu Thr His Ala Asp 130 135 140 Thr Ala Ser Leu Arg PheLeu Leu Tyr Cys Ala Ala His His Asp Gln 145 150 
155 160 Gly Gly Ile GlyPhe Val Met Thr Glu Arg Ala Ser Gln Arg Ala Gly 165 170 175 Tyr Arg ValPhe Arg Ala Glu Leu Leu Arg Gln Pro His Cys Arg Asn 180 185 190 Met TrpLeu Ser Gly Leu Pro Pro Ser Gly Val Arg Gln Leu Leu Ala 195 200 205 HisTyr Tyr Gly Pro Glu Ala Ala Glu Arg Arg Ala Pro Ala Tyr His 210 215 220Ala Thr Thr Gly Gly Asn Pro Leu Leu Leu Arg Ala Leu Thr Gln Asp 225 230235 240 Arg Gln Ala Ser His Thr Thr Leu Gly Ala Ala Gly Gly Asp Glu Pro245 250 255 Val His Gly Asp Ala Phe Ala Gln Ala Val Leu Asp Cys Leu HisArg 260 265 270 Ser Ala Glu Gly Thr Leu Glu Thr Ala Arg Trp Leu Ala ValLeu Glu 275 280 285 Gln Ser Asp Pro Leu Leu Val Glu Arg Leu Thr Gly ThrThr Ala Ala 290 295 300 Ala Val Glu Arg His Ile Gln Glu Leu Ala Ala IleGly Leu Leu Asp 305 310 315 320 Glu Asp Gly Thr Leu Gly Gln Pro Ala IleArg Glu Ala Ala Leu Gln 325 330 335 Asp Leu Pro Ala Gly Glu Arg Thr GluLeu His Arg Arg Ala Ala Glu 340 345 350 Gln Leu His Arg Asp Gly Ala AspGlu Asp Thr Val Ala Arg His Leu 355 360 365 Leu Val Gly Gly Ala Pro AspAla Pro Trp Ala Leu Pro Leu Leu Glu 370 375 380 Arg Gly Ala Gln Gln AlaLeu Phe Asp Asp Arg Leu Asp Asp Ala Phe 385 390 395 400 Arg Ile Leu GluPhe Ala Val Arg Ser Ser Thr Asp Asn Thr Gln Leu 405 410 415 Ala Arg LeuAla Pro His Leu Val Ala Ala Ser Trp Arg Met Asn Pro 420 425 430 His MetThr Thr Arg Ala Leu Ala Leu Phe Asp Arg Leu Leu Ser Gly 435 440 445 GluLeu Pro Pro Ser His Pro Val Met Ala Leu Ile Arg Cys Leu Val 450 455 460Trp Tyr Gly Arg Leu Pro Glu Ala Ala Asp Ala Leu Ser Arg Leu Arg 465 470475 480 Pro Ser Ser Asp Asn Asp Ala Leu Glu Leu Ser Leu Thr Arg Met Trp485 490 495 Leu Ala Ala Leu Cys Pro Pro Leu Leu Glu Ser Leu Pro Ala ThrPro 500 505 510 Glu Pro Glu Arg Gly Pro Val Pro Val Arg Leu Ala Pro ArgThr Thr 515 520 525 Ala Leu Gln Ala Gln Ala Gly Val Phe Gln Arg Gly ProAsp Asn Ala 530 535 540 Ser Val Ala Gln Ala Glu Gln Ile Leu Gln Gly CysArg Leu Ser Glu 545 550 555 560 Glu Thr Tyr Glu Ala Leu Glu Thr Ala LeuLeu Val Leu Val His Ala 565 570 575 Asp Arg Leu Asp Arg Ala 
Leu Phe TrpSer Asp Ala Leu Leu Ala Glu 580 585 590 Ala Val Glu Arg Arg Ser Leu GlyTrp Glu Ala Val Phe Ala Ala Thr 595 600 605 Arg Ala Met Ile Ala Ile ArgCys Gly Asp Leu Pro Thr Ala Arg Glu 610 615 620 Arg Ala Glu Leu Ala LeuSer His Ala Ala Pro Glu Ser Trp Gly Leu 625 630 635 640 Ala Val Gly MetPro Leu Ser Ala Leu Leu Leu Ala Cys Thr Glu Ala 645 650 655 Gly Glu TyrGlu Gln Ala Glu Arg Val Leu Arg Gln Pro Val Pro Asp 660 665 670 Ala MetPhe Asp Ser Arg His Gly Met Glu Tyr Met His Ala Arg Gly 675 680 685 ArgTyr Trp Leu Ala Thr Gly Arg Leu His Ala Ala Leu Gly Glu Phe 690 695 700Met Leu Cys Gly Glu Ile Leu Gly Ser Trp Asn Leu Asp Gln Pro Ser 705 710715 720 Ile Val Pro Trp Arg Thr Ser Ala Ala Glu Val Tyr Leu Arg Leu Gly725 730 735 Asn Arg Gln Lys Ala Arg Ala Leu Ala Glu Ala Gln Leu Ala LeuVal 740 745 750 Arg Pro Gly Arg Ser Arg Thr Arg Gly Leu Thr Leu Arg ValLeu Ala 755 760 765 Ala Ala Val Asp Gly Gln Gln Ala Glu Arg Leu His AlaGlu Ala Val 770 775 780 Asp Met Leu His Asp Ser Gly Asp Arg Leu Glu HisAla Arg Ala Leu 785 790 795 800 Ala Gly Met Ser Arg His Gln Gln Ala GlnGly Asp Asn Tyr Arg Ala 805 810 815 Arg Met Thr Ala Arg Leu Ala Gly AspMet Ala Trp Ala Cys Gly Ala 820 825 830 Tyr Pro Leu Ala Glu Glu Ile ValPro Gly Arg Gly Gly Arg Arg Ala 835 840 845 Lys Ala Val Ser Thr Glu LeuGlu Leu Pro Gly Gly Pro Asp Val Gly 850 855 860 Leu Leu Ser Glu Ala GluArg Arg Val Ala Ala Leu Ala Ala Arg Gly 865 870 875 880 Leu Thr Asn ArgGln Ile Ala Arg Arg Leu Cys Val Thr Ala Ser Thr 885 890 895 Val Glu GlnHis Leu Thr Arg Val Tyr Arg Lys Leu Asn Val Thr Arg 900 905 910 Arg AlaAsp Leu Pro Ile Ser Leu Ala Gln Asp Lys Ser Val Thr Ala 915 920 925 4227 DNA Streptomyces venezuelae 42 cccgaattcg ccgccgccat ggccgaa 27 43 35DNA Streptomyces venezuelae 43 gtgatgcatc ggctcggcga cggcccagtt ccgct 3544 45 DNA Streptomyces venezuelae 44 atgcatcacc accaccacca ctgagggggcgggcaagtga ccgac 45 45 33 DNA Streptomyces venezuelae 45 gggtctagagctgcaccggc gggtcgtagc gga 33 46 27 DNA Streptomyces venezuelae 46gaattcatcg 
agggggcggg caagtga 27 47 31 DNA Streptomyces venezuelae 47atgcatcagg tcgtcggtca ccgtgggttc t 31 48 30 DNA Streptomyces venezuelae48 ggatccgcgc cgggatgttc cgcgccctgt 30 49 31 DNA Streptomyces venezuelae49 aaaatgcatc agaggtctgt cggtcacttg c 31 50 41 DNA Streptomycesvenezuelae 50 aaaagatctt gatggtgcag gcgctgcgcc acggggtgct g 41 51 32 DNAStreptomyces venezuelae 51 aaaagatctc caacgaacag ttggtggacg ct 32 52 846DNA Streptomyces venezuelae 52 gtgaccgaca gacctctgaa cgtggacagcggactgtgga tccggcgctt ccaccccgcg 60 ccgaacagcg cggtgcggct ggtctgcctgccgcacgccg gcggctccgc cagctacttc 120 ttccgcttct cggaggagct gcacccctccgtcgaggccc tgtcggtgca gtatccgggc 180 cgccaggacc ggcgtgccga gccgtgtctggagagcgtcg aggagctcgc cgagcatgtg 240 gtcgcggcca ccgaaccctg gtggcaggagggccggctgg ccttcttcgg gcacagcctc 300 ggcgcctccg tcgccttcga gacggcccgcatcctggaac agcggcacgg ggtacggccc 360 gagggcctgt acgtctccgg tcggcgcgccccgtcgctgg cgccggaccg gctcgtccac 420 cagctggacg accgggcgtt cctggccgagatccggcggc tcagcggcac cgacgagcgg 480 ttcctccagg acgacgagct gctgcggctggtgctgcccg cgctgcgcag cgactacaag 540 gcggcggaga cgtacctgca ccggccgtccgccaagctca cctgcccggt gatggccctg 600 gccggcgacc gtgacccgaa ggcgccgctgaacgaggtgg ccgagtggcg tcggcacacc 660 agcgggccgt tctgcctccg ggcgtactccggcggccact tctacctcaa cgaccagtgg 720 cacgagatct gcaacgacat ctccgaccacctgctcgtca cccgcggcgc gcccgatgcc 780 cgcgtcgtgc agcccccgac cagccttatcgaaggagcgg cgaagagatg gcagaaccca 840 cggtga 846 53 281 PRT Streptomycesvenezuelae 53 Met Thr Asp Arg Pro Leu Asn Val Asp Ser Gly Leu Trp IleArg Arg 1 5 10 15 Phe His Pro Ala Pro Asn Ser Ala Val Arg Leu Val CysLeu Pro His 20 25 30 Ala Gly Gly Ser Ala Ser Tyr Phe Phe Arg Phe Ser GluGlu Leu His 35 40 45 Pro Ser Val Glu Ala Leu Ser Val Gln Tyr Pro Gly ArgGln Asp Arg 50 55 60 Arg Ala Glu Pro Cys Leu Glu Ser Val Glu Glu Leu AlaGlu His Val 65 70 75 80 Val Ala Ala Thr Glu Pro Trp Trp Gln Glu Gly ArgLeu Ala Phe Phe 85 90 95 Gly His Ser Leu Gly Ala Ser Val Ala Phe Glu ThrAla Arg Ile Leu 100 105 110 Glu Gln Arg His Gly Val Arg Pro Glu 
Gly LeuTyr Val Ser Gly Arg 115 120 125 Arg Ala Pro Ser Leu Ala Pro Asp Arg LeuVal His Gln Leu Asp Asp 130 135 140 Arg Ala Phe Leu Ala Glu Ile Arg ArgLeu Ser Gly Thr Asp Glu Arg 145 150 155 160 Phe Leu Gln Asp Asp Glu LeuLeu Arg Leu Val Leu Pro Ala Leu Arg 165 170 175 Ser Asp Tyr Lys Ala AlaGlu Thr Tyr Leu His Arg Pro Ser Ala Lys 180 185 190 Leu Thr Cys Pro ValMet Ala Leu Ala Gly Asp Arg Asp Pro Lys Ala 195 200 205 Pro Leu Asn GluVal Ala Glu Trp Arg Arg His Thr Ser Gly Pro Phe 210 215 220 Cys Leu ArgAla Tyr Ser Gly Gly His Phe Tyr Leu Asn Asp Gln Trp 225 230 235 240 HisGlu Ile Cys Asn Asp Ile Ser Asp His Leu Leu Val Thr Arg Gly 245 250 255Ala Pro Asp Ala Arg Val Val Gln Pro Pro Thr Ser Leu Ile Glu Gly 260 265270 Ala Ala Lys Arg Trp Gln Asn Pro Arg 275 280

What is claimed is:
 1. An isolated and purified nucleic acid segment comprising a nucleic acid sequence comprising a desosamine biosynthetic gene cluster, a fragment or a biologically active variant thereof, wherein the nucleic acid sequence is not derived from the eryC gene cluster of Saccharopolyspora erythraea or Streptomyces antibioticus.
 2. The isolated and purified nucleic acid segment of claim 1 comprising SEQ ID NO:3.
 3. The isolated and purified nucleic acid segment of claim 1 which encodes DesI, DesII, DesIII, DesIV, DesV, DesVI, DesVII, DesVIII or DesR.
 4. The isolated and purified nucleic acid segment of any one of claims 1 to 3 which is from Streptomyces venezuelae.
 5. An expression cassette comprising the nucleic acid segment of any one of claims 1 to 4 operably linked to a promoter functional in a host cell.
 6. A recombinant bacterial host cell in which at least a portion of a nucleic acid sequence encoding desosamine is disrupted so as to alter desosamine synthesis, wherein the nucleic acid sequence which is disrupted is not derived from the eryC gene cluster of Saccharopolyspora erythraea.
 7. The host cell of claim 6 wherein the nucleic acid sequence which is disrupted encodes DesI, DesII, DesIII, DesIV, DesV, DesVI, DesVII, DesVIII or DesR.
 8. A host cell, the genome of which is augmented with the expression cassette of claim 5.
 9. A product produced by the host cell of any one of claims 6 to 8 which is not produced by the corresponding non-recombinant or non-augmented host cell.
 10. The product of claim 9 which comprises a macrolide.
 11. The product of claim 9 or 10 which is biologically active.
 12. An isolated and purified nucleic acid segment comprising a nucleic acid sequence comprising a macrolide biosynthetic gene cluster encoding polypeptides which synthesize methymycin, pikromycin, neomethymycin, narbomycin, or a combination thereof, or a biologically active variant or fragment thereof.
 13. The isolated and purified nucleic acid segment of claim 12 comprising SEQ ID NO:5.
 14. The isolated and purified nucleic acid segment of claim 12 comprising a biologically active variant or fragment of SEQ ID NO:5.
 15. The isolated and purified nucleic acid segment of claim 12 which encodes PikR1, PikR2, PikAI, PikAII, PikAIII, PikAIV, PikAV, PikC or PikD.
 16. The isolated and purified nucleic acid segment of any one of claims 12 to 15 which is from Streptomyces venezuelae.
 17. A host cell, the genome of which is augmented with the nucleic acid segment of any one of claims 12 to 16.
 18. An isolated and purified nucleic acid sequence comprising SEQ ID NO:3, SEQ ID NO:5, a fragment thereof, the complement thereto, or which hybridizes thereto.
 19. An isolated polypeptide encoded by the nucleic acid segment of any one of claims 1 to 4 or 12 to 16.
 20. A recombinant host cell in which a pikAI gene, a pikAII gene, a pikAIII gene, a pikAIV gene, a pikB gene cluster, a pikAV gene cluster, a pikC gene, a pikR1 gene, a pikR2 gene, or a combination thereof, is disrupted so as to alter production of methymycin, neomethymycin, pikromycin, narbomycin, or a combination thereof.
 21. A macrolide or polyketide product produced by the host cell of claim 17 or 20 which is not produced by the corresponding non-recombinant or non-augmented host cell.
 22. The macrolide or polyketide of claim 21 which is biologically active.
 23. An isolated and purified DNA molecule comprising a first DNA segment encoding a first module and a second DNA segment encoding a second module, wherein the DNA segments together encode a recombinant polyhydroxyalkanoate monomer synthase, and wherein at least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae.
 24. A method of providing a polyhydroxyalkanoate monomer, comprising: (a) introducing into a host cell a DNA molecule comprising a DNA segment encoding a recombinant polyhydroxyalkanoate monomer synthase operably linked to a promoter functional in the host cell, wherein the recombinant polyhydroxyalkanoate monomer synthase comprises a first module and a second module, and wherein at least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae; and (b) expressing the DNA encoding the recombinant polyhydroxyalkanoate monomer synthase in the host cell so as to generate a polyhydroxyalkanoate monomer.
 25. A recombinant vector comprising one or more modules of a polyketide synthase wherein at least one module is from Streptomyces venezuelae.
 26. The method of claim 24 wherein the first module encodes a fatty acid synthase.
 27. A method of providing a polyhydroxyalkanoate monomer, comprising: (a) introducing into a host cell a DNA molecule encoding a fusion polypeptide, wherein the DNA molecule comprises a first DNA segment operably linked to a promoter functional in the host cell and a second DNA segment, wherein at least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae; and (b) expressing the DNA in the host cell so as to generate the fusion polypeptide.
 28. The host cell of claim 8 or 17 the native genome of which does not comprise an intact macrolide biosynthetic gene cluster encoding polypeptides which synthesize methymycin, pikromycin, neomethymycin, or narbomycin.
 29. A recombinant bacterial host cell comprising a deletion of the thioesterase domain of pikAIV gene.
 30. The recombinant host cell of claim 29 further comprising a deletion in the pikAV gene.
 31. An isolated and purified DNA molecule comprising a DNA segment comprising a pikA promoter.
 32. An expression cassette comprising a pikA promoter operably linked to a DNA molecule comprising a DNA segment comprising an open reading frame or a portion thereof.
 33. The expression cassette of claim 32 wherein the DNA segment encodes the thioesterase domain of pikAIV.
 34. The expression cassette of claim 32 wherein the DNA segment encodes the thioesterase II domain of pikAV.
 35. The expression cassette of claim 33 further comprising an acyl carrier protein domain.
 36. The expression cassette of claim 35 further comprising a thioesterase II domain.
 37. The expression cassette of claim 35 further comprising an acyl transferase domain.
 38. The expression cassette of claim 37 further comprising a β-ketoacyl-acyl carrier protein synthase domain.
 39. The expression cassette of any one of claims 32 to 38 wherein the DNA molecule comprises a second DNA segment comprising the leader sequence of pikAI operably linked to the first DNA segment.
 40. A host cell transformed with a plasmid comprising the expression cassette of any one of claims 32 to 39.
 41. The host cell of claim 40 which lacks the thioesterase domain of pikAIV gene cluster and the thioesterase II domain of pikAV gene.
 42. A method to alter polyketide chain length, comprising: expressing in a host cell an expression cassette comprising at least a portion of a DNA segment that encodes a module that catalyzes the final condensation of a polyketide so as to yield a polyketide product which is of a different length relative to a polyketide produced by a host cell which does not express the module, wherein the DNA segment that encodes an intact module encodes two different polypeptides, one of which has a lower molecular weight than the other polypeptide.
 43. The method of claim 42 wherein the intact module is pikA module 6.
 44. The method of claim 42 wherein the expression cassette is present on a plasmid.
 45. The method of claim 42 wherein the host cell is a polyketide-producing host cell.
 46. A product produced by the method of claim 42 which is not produced by a host cell which does not express the module.
 47. A method to prepare a polyketide product, comprising: expressing in a host cell an expression cassette comprising a promoter operably linked to a DNA segment comprising a portion of a first polyketide synthase gene so as to yield the product, wherein the expression cassette is present on a plasmid, wherein the chromosome of the host cell comprises at least a portion of a second polyketide synthase gene, and wherein both portions are operably linked to the native polyketide promoter of one of the polyketide genes.
 48. The method of claim 47 wherein the portions are from the same polyketide gene and wherein the portion on the host cell chromosome is different than the portion that is on the plasmid.
 49. The method of claim 48 wherein the portions together comprise the entire gene.
 50. The method of claim 49 wherein the gene is the pikA gene cluster.
 51. A host cell, the genome of which comprises at least a portion of a first polyketide synthase gene, comprising: a plasmid comprising a promoter operably linked to a DNA molecule comprising a DNA segment encoding a portion of a second polyketide synthase gene, wherein both portions are operably linked to the native promoter of one of the genes, and wherein the expression of both portions yields a polyketide.
 52. The host cell of claim 51 wherein the portions are from the same polyketide gene and wherein the portion on the host cell genome is different than the portion that is on the plasmid.
 53. The host cell of claim 52 wherein the portions together comprise the entire gene.
 54. The host cell of claim 53 wherein the gene is the pikA gene cluster.
 55. A polyketide produced by the host cell of claim 51.
 56. Use of a product of claim 9, 21 or 46 for the manufacture of a medicament for the treatment of a pathological condition or symptom in a mammal.
 57. The host cell of claim 7 wherein the nucleic acid sequence encoding DesR is disrupted.
 58. A product produced by the host cell of claim 57.
 59. The host cell of claim 7 wherein the nucleic acid sequence encoding DesI is disrupted.
 60. A product produced by the host cell of claim 59.